Friday, June 24, 2016

SQL Server trace flag 8721 - auto stats sync & async update info

Aaron Morelli (@sqlcrossjoin on Twitter) has a great collection of trace flags in the pdf linked below.  (Check to see if there's a more recent update, v6 was current as of April 2016.)
https://sqlcrossjoin.files.wordpress.com/2016/04/sqlcrossjoin_traceflagrepository_v6.pdf

Trace flag 8721 is of interest to me when investigating ETL behavior.  A complex ETL can change stats numerous times, across lots of stats keys.  Information about auto stats updates, in addition to the stats updates performed explicitly by ETL code, can really help complete the picture of what happened during an investigation.

I found this post from Prince Rastogi, where he walks through a demo of trace flag 8721: create a new table with a stats key, increase the row count from 0 to 1, then execute a query that requests the stats.  Here's an important note: use trace flag 3605 (redirect certain output to the SQL Server error log) together with trace flag 8721.
http://www.sqlservergeeks.com/sql-server-trace-flag-8721/

I had two questions, and answered them both by following the same process.
My first question: Does it still work in SQL Server 2016? Yep.
Second question: Does it capture both async & sync auto stats updates? Yep.
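Here's a minimal sketch of the kind of repro I used.  The table name matches what shows up in the log below, but the column definitions and exact statements are my reconstruction rather than the original script.

-- Minimal repro sketch; column definitions are hypothetical.
-- TF 3605 redirects trace output to the SQL Server error log;
-- TF 8721 writes the AUTOSTATS messages.
DBCC TRACEON (3605, -1);
DBCC TRACEON (8721, -1);

CREATE TABLE dbo.test_Tflag8721
(id      INT          NOT NULL PRIMARY KEY CLUSTERED,
 padding VARCHAR(100) NULL);

-- Take the table from 0 rows to 1 row.
INSERT INTO dbo.test_Tflag8721 (id, padding) VALUES (1, 'x');

-- A query with a predicate on the stats column requests the stats,
-- which triggers the auto stats update that TF 8721 logs.
SELECT id FROM dbo.test_Tflag8721 WHERE id = 1;

-- Turn the trace flags back off when finished.
DBCC TRACEOFF (8721, -1);
DBCC TRACEOFF (3605, -1);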

Snippet from the SQL Server error log below.

*****
Date,Source,Severity,Message
06/24/2016 12:16:22,spid23s,Unknown,AUTOSTATS: UPDATED Stats: test_Tflag8721..PK__test_Tfl__1B434632D45E0816 Dbid = 7 Indid = 1 Rows: 1 Duration: 0ms
06/24/2016 12:16:22,spid57,Unknown,AUTOSTATS: Tbl: test_Tflag8721 Objid:2089019169 Rows: 1.000000 Threshold: 500 Duration: 0ms **** >>>> ASYNC <<<< ****
06/24/2016 12:16:22,spid57,Unknown,AUTOSTATS: Tbl: test_Tflag8721 Objid:2089019169 Rows: 1.000000 Threshold: 500 Duration: 0ms **** >>>> ASYNC <<<< ****
06/24/2016 12:16:22,spid57,Unknown,AUTOSTATS: Tbl: test_Tflag8721 Objid:2089019169 Rows: 1.000000 Threshold: 500 Duration: 0ms **** >>>> ASYNC <<<< ****
06/24/2016 12:16:11,spid57,Unknown,The error log has been reinitialized. See the previous log for older entries.
06/24/2016 12:16:11,spid57,Unknown,Default collation: SQL_Latin1_General_CP1_CI_AS (us_english 1033)
06/24/2016 12:16:11,spid57,Unknown,The service account is 'NT Service\MSSQLSERVER'. This is an informational message; no user action is required.
06/24/2016 12:16:11,spid57,Unknown,Logging SQL Server messages in file 'M:\MSSQL\Log\ERRORLOG'.
06/24/2016 12:16:11,spid57,Unknown,Authentication mode is MIXED.
06/24/2016 12:16:11,spid57,Unknown,System Manufacturer: 'VMware<c/> Inc.'<c/> System Model: 'VMware Virtual Platform'.
06/24/2016 12:16:11,spid57,Unknown,Server process ID is 1488.
06/24/2016 12:16:11,spid57,Unknown,All rights reserved.
06/24/2016 12:16:11,spid57,Unknown,(c) Microsoft Corporation.
06/24/2016 12:16:11,spid57,Unknown,UTC adjustment: -5:00
06/24/2016 12:16:11,spid57,Unknown,Microsoft SQL Server 2016 (CTP3.0) - 13.0.700.242 (X64) <nl/>    Oct 26 2015 18:46:06 <nl/>    Copyright (c) Microsoft Corporation<nl/>    Enterprise Evaluation Edition (64-bit) on Windows Server 2012 R2 Standard 6.3 <X64> (Build 9600: ) (Hypervisor)
06/24/2016 12:16:11,spid57,Unknown,Attempting to cycle error log. This is an informational message only; no user action is required.
06/24/2016 12:15:52,spid57,Unknown,Setting database option AUTO_UPDATE_STATISTICS_ASYNC to ON for database 'SEASIDE'.
06/24/2016 12:12:56,spid57,Unknown,AUTOSTATS: Tbl: test_Tflag8721 Objid:2057019055 Rows: 1.000000 Threshold: 500 Duration: 2ms
06/24/2016 12:12:56,spid57,Unknown,AUTOSTATS: UPDATED Stats: test_Tflag8721..PK__test_Tfl__1B434632ECB10C7B Dbid = 7 Indid = 1 Rows: 1 Duration: 0ms
06/24/2016 12:12:50,spid57,Unknown,The error log has been reinitialized. See the previous log for older entries.
06/24/2016 12:12:50,spid57,Unknown,Default collation: SQL_Latin1_General_CP1_CI_AS (us_english 1033)
06/24/2016 12:12:50,spid57,Unknown,The service account is 'NT Service\MSSQLSERVER'. This is an informational message; no user action is required.
06/24/2016 12:12:50,spid57,Unknown,Logging SQL Server messages in file 'M:\MSSQL\Log\ERRORLOG'.
06/24/2016 12:12:50,spid57,Unknown,Authentication mode is MIXED.
06/24/2016 12:12:50,spid57,Unknown,System Manufacturer: 'VMware<c/> Inc.'<c/> System Model: 'VMware Virtual Platform'.
06/24/2016 12:12:50,spid57,Unknown,Server process ID is 1488.
06/24/2016 12:12:50,spid57,Unknown,All rights reserved.
06/24/2016 12:12:50,spid57,Unknown,(c) Microsoft Corporation.
06/24/2016 12:12:50,spid57,Unknown,UTC adjustment: -5:00
06/24/2016 12:12:50,spid57,Unknown,Microsoft SQL Server 2016 (CTP3.0) - 13.0.700.242 (X64) <nl/>    Oct 26 2015 18:46:06 <nl/>    Copyright (c) Microsoft Corporation<nl/>    Enterprise Evaluation Edition (64-bit) on Windows Server 2012 R2 Standard 6.3 <X64> (Build 9600: ) (Hypervisor)
*****

Tuesday, June 14, 2016

SSIS "buffers spooled" on a NUMA server


 So this happened today during an ETL.



A quick look through the InterTubes tells me that SSIS should spool buffers to disk only if:
  • a memory allocation failed, or
  • the Windows OS sent a low memory resource notification

So the server had plenty of free RAM.  But NUMA node 1 was in a pinch.  And SSIS spooled its buffers to disk.  Doggone it.

I guess I'd figured that notifications were sent based on server-wide memory state.  But maybe memory state on each individual NUMA node can lead to a low memory notification?

The target SQL Server instance, a beefy one, was also on this physical server.  There's 1.5 TB of RAM on the server.  :-)

We lowered max server memory by 100 GB, hoping to get a reasonable amount of free memory on NUMA node 1 for tomorrow's ETL.  (The server was also rebooted for an unrelated reason.)

I dunno though.  This is SQL Server 2014 SP1 CU2.   

Memory is already being consumed unevenly across the NUMA nodes.  If NUMA node 1 gets the short end of the stick by 6:00 am tomorrow, I think for now we may try to affinitize SSIS to NUMA node 2, in hopes that its memory allocations will come from that node and that enough free memory will remain there even while NUMA node 1 keeps getting short-changed.
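For keeping an eye on the SQL Server side of the per-node picture, something like the sketch below is what I'd reach for.  It only shows SQL Server's own allocations by node (Windows-level free memory per node still has to come from perfmon); the column choices are just my assumption of what's useful.

-- SQL Server memory by NUMA node, summed across memory clerks.
-- (Node 64 is the DAC node and can be ignored here.)
SELECT memory_node_id,
       SUM(pages_kb) / 1024 AS clerk_pages_mb
FROM   sys.dm_os_memory_clerks
GROUP BY memory_node_id
ORDER BY memory_node_id;

-- Committed and locked-page memory per node, from the memory manager's view.
SELECT memory_node_id,
       virtual_address_space_committed_kb / 1024 AS committed_mb,
       locked_page_allocations_kb / 1024 AS locked_pages_mb
FROM   sys.dm_os_memory_nodes
ORDER BY memory_node_id;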

*****Update*****
A blog post showing use of Windows System Resource Manager (WSRM) to affinitize SSIS.  But... that feature was deprecated in Windows Server 2012 R2. 
:-( Sad trombone.  

Maybe the next version of SSIS can support affinity natively, similar to the SQL Server database engine?  Or maybe SSIS can be made NUMA aware?  Or...

Well, at any rate, for maximum scalability and flexibility, gotta find some way of making this work...




Monday, June 13, 2016

L'esprit de L'escalier - Whither SQL Server Backup Buffers?

I've got a long, long list of interesting questions.  And I blog frightfully slowly :-)

So... why don't I start blogging more of the questions?  Someone else may very well be able to springboard from one of my questions to a *really* astounding answer!  Maybe.  I hope my questions are that good anyway :-)

Sometimes the questions come to me as a form of L'esprit de L'escalier - not something I wish I'd said but a question I wish I had asked.

So it is with my question: "Whither SQL Server Backup Buffers?"

Yesterday (or maybe really early this morning) I posed the question: "Whence SQL Server Backup Buffers?"

Here's what I noticed: 'Target Server Memory (KB)' of 33553968 hadn't been achieved.  'Total Server Memory (KB)' during the backup was 14947794.  Of that, 4194496 KB was allocated to MEMORYCLERK_BACKUP.

When the backup was cancelled, the memoryclerk_backup allocation was simply returned to the OS.

Why not give it to free memory?  It should have been nice, contiguous chunks of memory addresses.  The server was far from reaching target server memory.  The server wasn't under memory pressure.  This is what it looks like right now... and it would have looked nearly the same right after the backup was cancelled.
That's a lot of free memory.  Take away 4 GB from it for the backup buffers... yeah, it's still a lot of free memory.

I wonder:
-If 'max server memory' wasn't being overridden by a more reasonable target (because max server memory is 120 GB on a 32GB VM), would the behavior still be the same before reaching target?  I bet it would be.
-Is this behavior specific to not having reached target?  Or when reaching target would backup buffers be allocated, potentially triggering a shrink of the bpool, then returned to the OS afterward requiring the bpool to grow again?
-What other allocations are returned directly to the OS rather than given back to the memory manager to add to free memory?  I bet CLR does this, too.  Really large query plans?  Nah, I bet they go back into the memory manager's free memory.
-Is this a big deal?  I bet it could be.  Especially if a system is prone to develop persistent foreign memory among its NUMA nodes.  In most cases, it probably wouldn't matter.

Maybe I'll get around to testing some of this.  But it probably won't be for a while.
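If and when I do test it, this is roughly what I'd capture right before and right after cancelling the backup.  It's a sketch, not a worked experiment: the idea is that memory returned to the OS should show up as a drop in the process working set and 'Total Server Memory (KB)', while memory handed back to the memory manager should instead show up as a rise in 'Free Memory (KB)'.

-- Working set of the SQL Server process (drops if memory goes back to the OS).
SELECT physical_memory_in_use_kb
FROM   sys.dm_os_process_memory;

-- Memory manager counters (free memory rises if the allocation is handed
-- back to the memory manager as free memory instead).
SELECT counter_name, cntr_value
FROM   sys.dm_os_performance_counters
WHERE  object_name LIKE '%Memory Manager%'
AND    counter_name IN ('Total Server Memory (KB)', 'Free Memory (KB)');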

Whence SQL Server Backup Buffers?

Recently I saw a question on Twitter #sqlhelp about backup buffers, asking if they came from the bpool or MTL.

I think the question is trickier than first realized :-)

MTL (or mem-to-leave) is at this point more of a historic term than anything else.  The term was most relevant in the days of 32-bit Windows.  You can search old blog posts (not mine) and old forum posts to see the tons of fun related to 32-bit Windows and attempting to ensure that multi-page allocations and Windows direct allocations had enough available memory within the limited 32-bit address space.  (Hunt around for the "-g" startup option and the "/3GB" switch to *really* appreciate 64-bit Windows & SQL Server.)

Next, SQL Server 2012 brought a very significant change in SQL Server memory management.  Prior to SQL Server 2012, there were both a single-page allocator (SPA) and a multi-page allocator (MPA) in SQL Server.  While single-page allocations were governed by "max server memory", multi-page allocations were not.  SQL Server 2012 introduced the any-size page allocator, which took over the jobs of both previous allocators.  On a related note, the multi-page allocations which were previously outside of "max server memory" control were moved under its control.  (I'm really glad that happened - in general it made server behavior much more predictable for a given configuration.)
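One place that change shows up is in the memory clerk DMV itself.  A small sketch to illustrate (the pre-2012 version is commented out, since those columns no longer exist in 2012 and later):

-- SQL Server 2012 and later: one any-size page allocator, one pages_kb column.
SELECT TOP (10) type, SUM(pages_kb) AS pages_kb
FROM   sys.dm_os_memory_clerks
GROUP BY type
ORDER BY SUM(pages_kb) DESC;

-- SQL Server 2008 R2 and earlier exposed the two allocators separately:
-- SELECT type, SUM(single_pages_kb) AS spa_kb, SUM(multi_pages_kb) AS mpa_kb
-- FROM   sys.dm_os_memory_clerks
-- GROUP BY type;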

Now, about the idea of "coming from the buffer pool."  SQL Server still uses the term "stolen memory" to describe behavior and within its memory accounting.  But in SQL Server 2012 and beyond, the bpool is a consumer of memory from the memory manager in much the same way as other consumers.  So although the bpool may shrink so that another consumer can grow, that's a process managed by the memory manager.  And the bpool is usually on the "needs to shrink" side only because it tends to be one of the largest, if not the largest, consumers.

OK... so now I'll mention what I *think* the question really meant.  I think the question was meant to determine:
"Are backup buffers governed by 'max server memory', or should I plan for backup buffers outside of 'max server memory' like we still do for SQL Server worker threadstacks?"

That's a pretty good question.  If you search around, you may find a number of posts where the conclusion is... "I don't remember if it's included in max server memory or not." :-)

Let's see what can be learned with a quick little experiment.

Don't try this at home.  :-)

Only try it if you have exclusive access to the memory resources on the server AND a significant read load on the storage won't disrupt anyone.

What will happen if I run a backup to DISK='NUL' on a very large database with buffercount = 1024 and maxtransfersize = 4194304 (4 MB)?  Altogether, that should be 4 GB of memory allocation - it should be quite noticeable.

I'm testing on SQL Server 2016 RC2.
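The statement looked roughly like the sketch below.  The database name is a placeholder, and COPY_ONLY is my own precaution (it keeps the differential base undisturbed); it wasn't necessarily part of the original test.

-- 1024 buffers * 4 MB per buffer = 4 GB of backup buffer memory.
-- Database name is a placeholder; COPY_ONLY is my addition.
BACKUP DATABASE [BigTestDB]
TO DISK = 'NUL'
WITH COPY_ONLY,
     BUFFERCOUNT = 1024,
     MAXTRANSFERSIZE = 4194304;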

Here's how things looked before the backup.

   
Kinda boring.  Memoryclerk_backup has nothing in node 0 or node 64.
'Total Server Memory (KB)'  is just a hair over 10GB, even though 'Target Server Memory (KB)' is just shy of 32 GB.  'Max server memory' is actually 120000 MB.  There hasn't been much happening in this instance for a few days.
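The screenshots came from checks along these lines; this is a sketch, with the backup clerk broken out by NUMA node and the memory manager counters mentioned in the text:

-- Backup memory clerk, by NUMA node (node 64 is the DAC node).
SELECT memory_node_id, SUM(pages_kb) AS pages_kb
FROM   sys.dm_os_memory_clerks
WHERE  type = 'MEMORYCLERK_BACKUP'
GROUP BY memory_node_id;

-- Memory manager counters referenced in this post.
SELECT counter_name, cntr_value
FROM   sys.dm_os_performance_counters
WHERE  object_name LIKE '%Memory Manager%'
AND    counter_name IN ('Target Server Memory (KB)',
                        'Total Server Memory (KB)',
                        'Database Cache Memory (KB)',
                        'Stolen Server Memory (KB)',
                        'Free Memory (KB)');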

Let's fire up a backup.
The test database is several terabytes, but in this case I'm writing the backup to NUL and I'll cancel the backup as soon as I learn what I want to know.

How do the backup memory clerks and performance counters look now?




Memoryclerk_backup on node 0 now has just over 4 GB allocated to it.  Total server memory has increased by... let's see... 14947744 KB - 10753248 KB =  4194496 KB.  That's exactly the size of the allocation to the backup clerk.

What about stolen memory?  It increased by 5457832 KB - 1257344 KB = 4200488 KB.  That's a little bit more than what memoryclerk_backup picked up.  In part because I opened a second query window while testing :-)

Remember - the memory isn't really stolen from the buffer pool necessarily.  Notice that 'Database Cache Memory (KB)' is the same 9467600 KB before and during the backup.  'Free Memory (KB)' has declined by 28304 KB - 22312 KB = 5992 KB.

Doesn't that number look suspicious?

4194496 KB (backup clerk allocation) + 5992 KB (decline in free memory) = 4200488 KB (the total increase in stolen memory)

Whew! Glad that worked out :-)

So some memory really was stolen.  But it wasn't stolen from the bpool, which remained the same size during the backup (until I canceled it) as it was before the backup.  The additional stolen memory came from the free memory.

Now, because target memory in this SQL Server instance is so much lower than 'max server memory' (my VM got shrunk from 128 GB RAM to 32 GB RAM and I never adjusted 'max server memory'), I can't say that I've proved that backup buffers will always come from within the allocation governed by 'max server memory'.  Can I say it's left as an exercise for the reader?

It's late and I gotta get some sleep...

:-)

Wednesday, May 4, 2016

SQL Server: A quick word about large sorts/hashes, lookups/sec, and CPU utilization

Several years ago I began trying to model SQL Server DSS system behavior based on activity rates and resource utilization.  It's an ongoing effort :-)  This is the motivation for many of my strange-looking perfmon graphs - I'm trying to fit it all together.

One of my first stops was to graph CPU utilization against "page lookups" across many systems.
I quickly learned that spinlock contention could disrupt the correlation between CPU utilization and page lookups or database logical IO.  I had suspected that anyway - I'd seen the same effect on another database platform.

It stands to reason that high rates of plan compile, or lots of XML parsing and manipulation, or usage of CPU by CLR, or CPU consumed by something other than SQL Server will likewise disrupt the correlation.  Operations using worktables can also throw things for a loop, whether reviewing perfmon or results from SET STATISTICS IO ON.

But today I want to mention another disrupter of the relationship between CPU utilization and database logical IO: significant query use of large sorts and hashes.

Consider the following perfmon from a DSS system (96 GB RAM on the system).  Page Lookups/second in blue against the left vertical axis, CPU utilization in orange against the right vertical axis.  I've color-coded this observation period into green, yellow and red portions of time.  During the initial green period of time, there is a strong correlation between CPU utilization and page lookups.  That gives way to the yellow time period, where the relationship between CPU and logical IO comes unglued.  Finally, in the red section the amount of CPU utilized is completely disproportionate to database logical IO.



I don't want to bore you with my "home movie" of the dozens of perfmon graphs I build when I review systems.  :-)  Let's cut to the chase.

 

The graph above has tempdb footprint (light blue) stacked on top of used query memory (dark blue) against the left vertical axis.  The green period has very limited use of query memory.  During the yellow period, a moderate amount of query memory was used.  During the red period, a large amount of query memory was used and at a number of points operations spilled into tempdb.  As query memory was used more extensively, the CPU:lookups/sec correlation was more disrupted.

Once fully considered, this makes sense: query memory is "stolen" from the database page buffer pool.  References to pages in the page pool are "page lookups", but each time stolen query memory is poked and prodded... well, that's not a page lookup.  But it has CPU cost.

So if you are reviewing a system that usually has high correlation between CPU utilization and page lookups/second and suddenly the relationship is disrupted: yeah, could be spinlock contention.  Could be lots of other things, too.  Might be large sorts/hashes dominating memory traffic and less reliance on the database cache where "page lookups" are tracked.
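If you want to spot-check the correlation without firing up perfmon, a rough sketch is below.  'Page lookups/sec' in the counter DMV is a cumulative value, so it has to be sampled twice and diffed; I'm using 'Granted Workspace Memory (KB)' as a stand-in for query memory in use, which is not exactly the "used query memory" series in the graphs above.

-- Sample the cumulative counter twice, 10 seconds apart, and diff it.
DECLARE @lookups1 BIGINT, @lookups2 BIGINT;

SELECT @lookups1 = cntr_value
FROM   sys.dm_os_performance_counters
WHERE  object_name LIKE '%Buffer Manager%'
AND    counter_name = 'Page lookups/sec';

WAITFOR DELAY '00:00:10';

SELECT @lookups2 = cntr_value
FROM   sys.dm_os_performance_counters
WHERE  object_name LIKE '%Buffer Manager%'
AND    counter_name = 'Page lookups/sec';

SELECT (@lookups2 - @lookups1) / 10 AS page_lookups_per_sec,
       (SELECT cntr_value
        FROM   sys.dm_os_performance_counters
        WHERE  object_name LIKE '%Memory Manager%'
        AND    counter_name = 'Granted Workspace Memory (KB)') AS granted_workspace_kb;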

OK... I guess that wasn't really a "quick word".

I tried; not sorry. :-)

I got a hunch about SQL Server Fragmentation: Part VI

***Image quality in this post is horrible, I apologize.  I'll replace later.


It's been a while... but I'm feeling a hunch come on again...

I'm working in a database with a single mdf data file.  (That's important context when looking at results of sys.dm_db_database_page_allocations.)

Let's create my favorite kind of two-column demo table - clustered index key on an integer and an uncompressed varchar(4096).  With the varchar filled to capacity, two rows can't fit on a single 8 KB page, so that forces one row per CI page.

Then insert into the table in reverse numeric order: key values from 4084 down to 1, with a varchar value just to take up space.
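The original setup was in a screenshot, so here's a rough reconstruction.  The table and column names are my own, and the filler value is an assumption; the key range and the reverse insert order are from the post.

-- Hypothetical names; one row per clustered index page by design.
CREATE TABLE dbo.frag_hunch
(ci_key INT           NOT NULL,
 filler VARCHAR(4096) NOT NULL,
 CONSTRAINT pk_frag_hunch PRIMARY KEY CLUSTERED (ci_key));

-- Insert in reverse key order, 4084 down to 1.
DECLARE @i INT = 4084;
WHILE @i >= 1
BEGIN
    INSERT INTO dbo.frag_hunch (ci_key, filler)
    VALUES (@i, REPLICATE('a', 4096));
    SET @i -= 1;
END;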



 How does that look in terms of fragmentation, by the most commonly used measure?  99.9755%, huh?  (See part I of my musings "I got a hunch about SQL Server Fragmentation Part I" at http://sql-sasquatch.blogspot.com/2015/08/i-got-hunch-about-sql-server.html )
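The "most commonly used measure" here is avg_fragmentation_in_percent from sys.dm_db_index_physical_stats; the check would look something like this (the table name is my placeholder from the sketch above):

-- Leaf-level logical fragmentation of the clustered index.
SELECT index_id, index_level, page_count, avg_fragmentation_in_percent
FROM   sys.dm_db_index_physical_stats
           (DB_ID(), OBJECT_ID('dbo.frag_hunch'), 1, NULL, 'DETAILED')
WHERE  index_level = 0;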



OK, now let's delete every row with an odd-numbered clustered index value.
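Something like this, continuing with the placeholder names:

-- Remove the odd keys, leaving every other clustered index page empty.
DELETE FROM dbo.frag_hunch
WHERE ci_key % 2 = 1;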
 

How does fragmentation look now?  Pretty much the same, at 99.9511% (are you surprised that the fragmentation measure improved slightly when all that happened was the deletion of every other row??).



Now, *almost everyone in the world* will tell you that after a certain level of fragmentation you should go with an index rebuild rather than an index reorganization.  Right?  But... I'm the guy that *sometimes* rebuilds indexes at 0.44% fragmentation (see http://sql-sasquatch.blogspot.com/2016/02/sql-server-who-on-earth-would-rebuild.html).
 
What do you think the outcome was?
Let's check.
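The reorg and the re-check, again with my placeholder names:

-- Reorganize (not rebuild) the clustered index, then re-check fragmentation.
ALTER INDEX pk_frag_hunch ON dbo.frag_hunch REORGANIZE;

SELECT page_count, avg_fragmentation_in_percent
FROM   sys.dm_db_index_physical_stats
           (DB_ID(), OBJECT_ID('dbo.frag_hunch'), 1, NULL, 'DETAILED')
WHERE  index_level = 0;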


Cool!  From 99.9511% fragmentation to 0.6853% fragmentation!  Not bad for a reorg rather than a rebuild! :-)

As I built the table, rows were inserted into the clustered index with keys decreasing in numeric order.  Traveling the clustered index in increasing key order, the "next data page" after each data page almost certainly has a lower page ID.  That's the cause of the high reported fragmentation percent in the original table, and in the resulting table after delete of all odd-numbered rows.

The index reorg moves many rows into the empty pages (each odd-numbered row delete resulted in an empty page).  But if all the reorg did was smush together the reverse-ordered rows/pages, the resulting reported fragmentation at the end would still be high.

The index reorg ALSO reordered pages as it was able.

Before the index reorg, every data page had a page ID that was HIGHER than its "next page ID", due to reverse-ordered inserts.  After the reorg, with 2043 data pages in the clustered index...


After the index reorg there are only 3 data pages left that have page IDs higher than their next page's page ID (unnatural order, if you will).
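That count came from sys.dm_db_database_page_allocations (undocumented, so use with care).  Here's a sketch of the check, assuming the single-file database mentioned at the top of this post and the placeholder names from above:

-- Count data pages whose page ID is higher than the next page's page ID.
-- DETAILED mode is needed to populate next_page_page_id.
SELECT COUNT(*) AS unnatural_order_pages
FROM   sys.dm_db_database_page_allocations
           (DB_ID(), OBJECT_ID('dbo.frag_hunch'), 1, NULL, 'DETAILED')
WHERE  page_type_desc = 'DATA_PAGE'
AND    next_page_page_id IS NOT NULL
AND    allocated_page_page_id > next_page_page_id;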

So, two takeaways:
  • reorg can re-order data pages into a better logical order in addition to making the index more compact
  • the general guidelines about "reorg in this range of reported fragmentation, rebuild above this level of reported fragmentation, ignore otherwise" don't tell the whole story


 I'm going somewhere with this, y'all, I promise.  It's just gonna take me a while to get there...