Wednesday, August 24, 2016

Windows Guest: VMware Perfmon metrics intermittently missing?

I'm using logman to collect Windows, SQL Server, and VMware metrics in a CSV file (30-second intervals).

Not sure what's causing the sawtooth pattern below.  I thought it was a timekeeping problem on the ESXi host; now I'm not so sure.

See this Ryan Ries blog post for what may be a similar issue, caused by clock sync of guest with ESXi host.
https://www.myotherpcisacloud.com/post/Mystery-of-the-Performance-Counters-with-Negative-Denominators!
But that's also somewhat of a reversal of the problem.  In that case, guest metrics were returning -1.  In my case, it's the VMware metrics (passed through from the host) that are missing.

The same pattern for "Host processor speed in MHz" adds to the mystery.
If only "Effective VM Speed in MHz" was missing, I'd chalk it up to a possible arithmetic issue, with a negative denominator resulting from time skew between guest and host.
But... "Host processor speed in MHz" should be a constant 2600 in this case.  Maybe somehow its still calculated with a time interval, and time skew can screw it up?

For now, I've got a loop running on this VM logging local time, and using NET TIME to retrieve the time from another VM as well.  That's turned up variations, but they look like delays of up to 5 seconds in contacting the other VM rather than large time skew between the VMs.

Guess I'll see what turns up...

*****
So far this is the closest I've found.  Similar problem reported - no resolution.

"Windows VM perfmon counters - VM Processor"
https://communities.vmware.com/thread/522281

Thursday, August 4, 2016

Modeling SQL Server Behavior for Performance Intervention: Feed the CPUs

Before recommending any changes on a SQL Server system, I like to collect a number of days' worth of perfmon data.  I also like to ask if there are particular windows of time or particular queries that are performance concerns.  That can help prevent me from spending lots of time on time windows or queries that are considered low value :-) But that's a story for another day.

When reviewing perfmon data with the intent of finding system optimizations, I like to say:
"It helps to have expected models of behavior, and a theory of performance."

I don't say that just to sound impressive - honest (although if people tell me it sounds impressive I'll probably say it more frequently).  There are a lot of thoughts behind that; I'll try to unpack them over time.

When I work with SQL Server batch-controlled workflows, I use the theory "feed the CPUs".  That's the simplest positive adaptation I could come up with of Kevin Closson's paradigm "Everything is a CPU problem" :-)

What I mean by "Feed the CPUs" is that memory and disk response times are the primary factors determining the maximum rate at which the CPUs can process the data.  The nuts & bolts of such a model for SQL Server are slightly different from those of a similar model for Oracle: SQL Server always accesses persistent data through the database cache, while Oracle combines shared access to database cache in the SGA with private, direct access to persistent data via the PGA.

Here's where I'll throw in a favorite quote.

*****
"Essentiallyall models are wrongbut some are useful."
Box, George E. P.; Norman R. Draper (1987)
Empirical Model-Building and Response Surfaces, p. 424
Wiley ISBN 0471810339
*****

The workflows I work with most often are quite heavy on disk access.  The direction of change of CPU utilization over time will usually match the direction of change in read bytes per second.  So if you want to improve performance - i.e., shorten the elapsed time of a batch workload on a given system - feed the CPUs: increase average disk read bytes per second over the time window.
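Perfmon's disk counters are what I graph, but here's a sketch of getting a comparable number from inside SQL Server - nothing authoritative, just two snapshots of the cumulative file stats and a delta:

*****
-- Snapshot cumulative read bytes for all databases/files.
SELECT SUM(num_of_bytes_read) AS cum_read_bytes
INTO   #s1
FROM   sys.dm_io_virtual_file_stats(NULL, NULL);

WAITFOR DELAY '00:00:30';   -- sample interval; adjust to taste

-- Delta over the interval approximates average read bytes/sec.
SELECT (SUM(vfs.num_of_bytes_read) - MAX(s1.cum_read_bytes)) / 30.0
       AS avg_read_bytes_per_sec
FROM   sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
CROSS JOIN #s1 AS s1;

DROP TABLE #s1;
*****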

That's a very basic model.  It ignores concepts like compression.  In my case, all tables and indexes should be page compressed.  If not, I'll recommend it :)  It kinda sidesteps portions of the workload that may be 100% database cache hit.  (Although I consider that included in the "usually" equivocation.)  It definitely doesn't account for portions of the workload where sorting or other CPU-intensive work (like string manipulation) dominates CPU utilization.  Still, a simple model can be very useful.
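Since I said I'd recommend page compression where it's missing, here's the kind of quick check I run first - a sketch, nothing exotic:

*****
-- List partitions of user tables/indexes that are NOT page compressed.
SELECT s.name AS schema_name,
       o.name AS table_name,
       i.name AS index_name,     -- NULL for heaps
       p.partition_number,
       p.data_compression_desc
FROM   sys.partitions AS p
JOIN   sys.objects    AS o ON o.object_id = p.object_id
JOIN   sys.schemas    AS s ON s.schema_id = o.schema_id
JOIN   sys.indexes    AS i ON i.object_id = p.object_id
                          AND i.index_id  = p.index_id
WHERE  o.is_ms_shipped = 0
AND    p.data_compression_desc <> 'PAGE'
ORDER BY s.name, o.name, i.index_id;
*****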

Here's a 2 hour time window from a server doing some batch ETL processing.  You can kinda see how CPU utilization trends upward as read bytes per second trends upward, right?  I've highlighted two areas with boxes to take a closer look.



Let's look at the 8:15 to 8:30 am window first.  By zooming in on just that window & changing the scale of the CPU utilization axis, the relationship between CPU utilization and read bytes/second is much easier to see.  That relationship - the direction of change - is the same as in other parts of the 2 hour window where the scale is different.  Depending on the complexity of the work (number of filters, calculations to perform on the data, etc.), the amount of CPU used per unit of data will vary.  But the relationship of directionality* still holds.  I'm probably misusing the term 'directionality' but it feels right in these discussions :-)



All right.  Now what about the earlier time window?  I've zoomed in a bit closer, and it's 7:02 to 7:07 that's really interesting.



Assuming that SQL Server is the only significant CPU consumer, and assuming that whole 5-minute window isn't dominated by spinlocks (which at ~40% CPU utilization doesn't seem too likely), the most likely explanation is simple: 5 minutes of 100% cache hit workload, no physical reads needed.  That can be supported by looking at perfmon counters for page lookups/second (roughly, logical IO against the database cache) and the difference between granted memory and reserved memory (which equals the amount of used sort/hash memory at that time).
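Both of those can be pulled from the SQL Server side as well.  A sketch - note that in the DMV, Page lookups/sec is a cumulative value, so it takes two samples and a delta to turn it into a rate; the two memory counters are point-in-time:

*****
-- Counters supporting the "100% cache hit" explanation.
-- Granted minus reserved workspace memory ~= used sort/hash memory.
SELECT RTRIM(object_name)  AS object_name,
       RTRIM(counter_name) AS counter_name,
       cntr_value
FROM   sys.dm_os_performance_counters
WHERE  counter_name IN (N'Page lookups/sec',
                        N'Granted Workspace Memory (KB)',
                        N'Reserved Server Memory (KB)');
*****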

OK... well, that's all the time I want to spend justifying the model for now :-)


Here's the original graph again, minus the yellow boxes.  There's plenty of CPU available most of the time.  (No end users to worry about on the server at that time, so we can push it to ~90% CPU utilization comfortably.) Plenty of bandwidth available.  If we increase average read bytes/sec, we can increase average CPU utilization.  Assuming that the management overhead is insignificant (spinlocks, increased context switching among busy threads, etc.), increasing average CPU utilization will result in a corresponding decrease in elapsed time for the work.



All right.  So next we can investigate how we'll increase read bytes per second on this system.

Ciao for now!








Tuesday, August 2, 2016

SSAS Multidimensional Cube Processing - Shrinking the unfamiliar window

Last week I posted some of my perfmon graphs from an SSAS server.  I want to model the work happening on an SSAS server during multidimensional cubes processing - both dimension and fact processing.

That's here:
SSAS Multidimensional Cube Processing - What Kind of Work is Happening?
http://sql-sasquatch.blogspot.com/2016/07/ssas-multidimensional-cube-processing.html

There was a window at the end of the observation period where data was still flowing across the network, CPU utilization was still high enough to indicate activity, but the counters for rows read/converted/written/created per second were all zero.  The index rows/sec counter was also zero.



A colleague read my post, looked at my data... and pointed out:
"One of your graphs shows what was happening there!"

Sure enough, he was right.  Even though I time-shifted my graphs to somewhat obscure the data, he noticed what activity filled the gaps.  I'm gonna hafta work hard to make sure these guys don't lap me soon. :-)

  
In-memory file work, mainly for dimensions. Well, that shrinks the mystery window for me :-)

Let's look at the very tail end of the observation.

At the very tail end of the observation period, data stops coming over the network.  But CPU utilization increases. And CPU utilization drops off as "In-Memory Other File KB" drops off.

So this "other file" was somewhat expensive to handle, even though only for a limited time.

Wonder what that file was, since it wasn't a Dimension Index, Property, or String file?

By now you're probably wondering - why don't you just use profiler already? :-)

I'll get there - soon, probably.  But I'm trying to lay the groundwork for understanding system activity independent of knowing the individual cubes being processed.  My colleagues and I will likely be reviewing many perfmon captures across many cubes servers.  Some of the captures will almost certainly be started after cubes processing has begun.

So I'm hoping to get as far as possible from just the perfmon files, before returning for another source of information.

But certainly - being able to tie out system activity to processing step beginnings and ends from Profiler is preferable when in a position to influence cube design and development from the front end :-)




Friday, July 29, 2016

SSAS Multidimensional Cube Processing - What Kind of Work is Happening?

**Update**
I added a post on August 2, 2016 - answering some questions and leading to others :-)
SSAS Multidimensional Cube Processing - Shrinking the unfamiliar window
http://sql-sasquatch.blogspot.com/2016/08/ssas-multidimensional-cube-processing.html
****
 
In order to evaluate SQL Server systems, I've worked perfmon data over and over until I was satisfied that I could tell a high-level story of what work was being done (rows scanned/sec, rows returned/sec, rows written/sec, log flush bytes/sec) and who was doing that work (active requests, active parallel threads).
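For the "who" side, here's a sketch of the kind of spot check I pair with the perfmon counters - my own query, not a standard one, and counting tasks per request is only an approximation of active parallel threads:

*****
-- Active requests and their current task (thread) counts.
SELECT r.session_id,
       r.status,
       r.command,
       COUNT(t.task_address) AS active_tasks
FROM   sys.dm_exec_requests AS r
LEFT JOIN sys.dm_os_tasks   AS t ON t.session_id = r.session_id
WHERE  r.session_id > 50          -- skip most system sessions
GROUP BY r.session_id, r.status, r.command
ORDER BY active_tasks DESC;
*****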

I'm starting to work through a similar process for SSAS MultiDimensional Cubes - processing first.

Some graphs from perfmon.

Because Analysis Services is on a separate VM from the RDBMS server, I know that all of the incoming data to be worked on is coming over the wire.

So what kind of relationship is there between network bytes received and CPU utilization?


That trends pretty well.  I've put a box around an interesting area. In that area, bytes received per second is about the same as earlier periods of time.  But CPU is quite a bit lower.
What's being done with the data that is coming in?

Well, that makes sense.  Rows are read, converted, and written.  Looks like writing rows is timer-based, or maybe threshold-based?

But the unusual time period is still unusual.  Rows are read, but not converted or written.  Hmmm.

OK, let's take a look at rows read/converted/written against CPU utilization.


OK.  CPU utilization tracks pretty well with rows read/converted outside of the mystery range following 9:30.
But now 3 additional interesting time periods are evident, in the red boxes.  CPU utilization is much higher in the red boxes than would be predicted by rows read/converted.

What's the CPU doing at about 8:05, 8:45, 9:30? (Not forgetting the original question about 9:40 to about 10:15.)

Maybe there are other kinds of work I haven't yet included in these graphs?

How about index rows/sec?


Aha!  OK.  So index rows/sec is another part of the picture, and helps to explain some of the CPU utilization not explained by rows read/converted/written.  But it still doesn't explain the tail end of the graph.

I also noticed that there is a counter for rows created/sec.


All right. Got some rows created at 8:05 am, and 8:45 am.  None at 9:30.  And none in the final mystery period.

So the work of read/convert/create/write rows and index rows is a pretty good profile of the CPU work done up until about 9:40 am.

What types of threads were doing the work?  What about the non-I/O threads in the processing pool?


Wow.  Very weak relationship between the number of these threads and CPU utilization.  These processing pool non-I/O threads may be assigned to tasks, and in that sense "busy".  But in this case, it sure seems like they are mostly busy waiting.

There are a couple more thread types to consider.


So the I/O threads match well to the index rows and created rows.  There's only 1 busy query pool thread over the whole time. 

The processing pool non-I/O threads must be doing most if not ALL of the read/convert/write work.  But most of these threads must be spending most of their time waiting, since the number of "busy threads" has almost no correlation to CPU utilization.

But I still don't know what was going on in that mystery period.  That'll have to be for another day...










Friday, June 24, 2016

SQL Server trace flag 8721 - auto stats sync & async update info

Aaron Morelli (@sqlcrossjoin on Twitter) has a great collection of trace flags in the pdf linked below.  (Check to see if there's a more recent update, v6 was current as of April 2016.)
https://sqlcrossjoin.files.wordpress.com/2016/04/sqlcrossjoin_traceflagrepository_v6.pdf

Trace flag 8721 is of interest to me when investigating ETL behavior.  A complex ETL could change stats numerous times for lots of stats keys.  Information about auto stats updates, in addition to the stats updates performed explicitly by ETL code, can really help complete the picture of what happens during an investigation.

I found this post from Prince Rastogi, where he walks through a demo of trace flag 8721 by creating a new table with a stats key, increasing the rows from 0 to 1, then executing a query that requests the stats.  Here's an important note: use trace flag 3605 (which redirects some output to the SQL Server error log) together with trace flag 8721.
http://www.sqlservergeeks.com/sql-server-trace-flag-8721/

I had two questions, and answered them both by following the same process.
My first question: Does it still work in SQL Server 2016? Yep.
Second question: Does it capture both async & sync auto stats updates? Yep.
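Here's roughly the sequence I used - a minimal sketch modeled on Prince Rastogi's demo (table name is a placeholder; run it in a scratch database):

*****
DBCC TRACEON(3605, 8721, -1);  -- 3605 routes the output to the error log

CREATE TABLE dbo.test_Tflag8721 (i INT IDENTITY PRIMARY KEY, filler CHAR(100));
INSERT dbo.test_Tflag8721 (filler) VALUES ('x');   -- row count goes 0 -> 1

-- Compiling this query requests stats on the PK, so auto stats fires and
-- trace flag 8721 writes the AUTOSTATS detail to the error log.
SELECT COUNT(*) FROM dbo.test_Tflag8721 WHERE i > 0;

-- For the async question: flip the database option, then repeat the
-- insert + query against a fresh table and look for the ASYNC marker.
ALTER DATABASE CURRENT SET AUTO_UPDATE_STATISTICS_ASYNC ON;
*****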

Snippet from the SQL Server error log below.

*****
Date,Source,Severity,Message
06/24/2016 12:16:22,spid23s,Unknown,AUTOSTATS: UPDATED Stats: test_Tflag8721..PK__test_Tfl__1B434632D45E0816 Dbid = 7 Indid = 1 Rows: 1 Duration: 0ms
06/24/2016 12:16:22,spid57,Unknown,AUTOSTATS: Tbl: test_Tflag8721 Objid:2089019169 Rows: 1.000000 Threshold: 500 Duration: 0ms **** >>>> ASYNC <<<< ****
06/24/2016 12:16:22,spid57,Unknown,AUTOSTATS: Tbl: test_Tflag8721 Objid:2089019169 Rows: 1.000000 Threshold: 500 Duration: 0ms **** >>>> ASYNC <<<< ****
06/24/2016 12:16:22,spid57,Unknown,AUTOSTATS: Tbl: test_Tflag8721 Objid:2089019169 Rows: 1.000000 Threshold: 500 Duration: 0ms **** >>>> ASYNC <<<< ****
06/24/2016 12:16:11,spid57,Unknown,The error log has been reinitialized. See the previous log for older entries.
06/24/2016 12:16:11,spid57,Unknown,Default collation: SQL_Latin1_General_CP1_CI_AS (us_english 1033)
06/24/2016 12:16:11,spid57,Unknown,The service account is 'NT Service\MSSQLSERVER'. This is an informational message; no user action is required.
06/24/2016 12:16:11,spid57,Unknown,Logging SQL Server messages in file 'M:\MSSQL\Log\ERRORLOG'.
06/24/2016 12:16:11,spid57,Unknown,Authentication mode is MIXED.
06/24/2016 12:16:11,spid57,Unknown,System Manufacturer: 'VMware<c/> Inc.'<c/> System Model: 'VMware Virtual Platform'.
06/24/2016 12:16:11,spid57,Unknown,Server process ID is 1488.
06/24/2016 12:16:11,spid57,Unknown,All rights reserved.
06/24/2016 12:16:11,spid57,Unknown,(c) Microsoft Corporation.
06/24/2016 12:16:11,spid57,Unknown,UTC adjustment: -5:00
06/24/2016 12:16:11,spid57,Unknown,Microsoft SQL Server 2016 (CTP3.0) - 13.0.700.242 (X64) <nl/>    Oct 26 2015 18:46:06 <nl/>    Copyright (c) Microsoft Corporation<nl/>    Enterprise Evaluation Edition (64-bit) on Windows Server 2012 R2 Standard 6.3 <X64> (Build 9600: ) (Hypervisor)
06/24/2016 12:16:11,spid57,Unknown,Attempting to cycle error log. This is an informational message only; no user action is required.
06/24/2016 12:15:52,spid57,Unknown,Setting database option AUTO_UPDATE_STATISTICS_ASYNC to ON for database 'SEASIDE'.
06/24/2016 12:12:56,spid57,Unknown,AUTOSTATS: Tbl: test_Tflag8721 Objid:2057019055 Rows: 1.000000 Threshold: 500 Duration: 2ms
06/24/2016 12:12:56,spid57,Unknown,AUTOSTATS: UPDATED Stats: test_Tflag8721..PK__test_Tfl__1B434632ECB10C7B Dbid = 7 Indid = 1 Rows: 1 Duration: 0ms
06/24/2016 12:12:50,spid57,Unknown,The error log has been reinitialized. See the previous log for older entries.
06/24/2016 12:12:50,spid57,Unknown,Default collation: SQL_Latin1_General_CP1_CI_AS (us_english 1033)
06/24/2016 12:12:50,spid57,Unknown,The service account is 'NT Service\MSSQLSERVER'. This is an informational message; no user action is required.
06/24/2016 12:12:50,spid57,Unknown,Logging SQL Server messages in file 'M:\MSSQL\Log\ERRORLOG'.
06/24/2016 12:12:50,spid57,Unknown,Authentication mode is MIXED.
06/24/2016 12:12:50,spid57,Unknown,System Manufacturer: 'VMware<c/> Inc.'<c/> System Model: 'VMware Virtual Platform'.
06/24/2016 12:12:50,spid57,Unknown,Server process ID is 1488.
06/24/2016 12:12:50,spid57,Unknown,All rights reserved.
06/24/2016 12:12:50,spid57,Unknown,(c) Microsoft Corporation.
06/24/2016 12:12:50,spid57,Unknown,UTC adjustment: -5:00
06/24/2016 12:12:50,spid57,Unknown,Microsoft SQL Server 2016 (CTP3.0) - 13.0.700.242 (X64) <nl/>    Oct 26 2015 18:46:06 <nl/>    Copyright (c) Microsoft Corporation<nl/>    Enterprise Evaluation Edition (64-bit) on Windows Server 2012 R2 Standard 6.3 <X64> (Build 9600: ) (Hypervisor)
*****

Tuesday, June 14, 2016

SSIS "buffers spooled" on a NUMA server


 So this happened today during an ETL.



A quick look through the InterTubes tells me that SSIS should spool buffers to disk only if:
  • a memory allocation failed
  • a low-memory resource notification was sent by the Windows OS

So the server had plenty of free RAM.  But NUMA node 1 was in a pinch.  And SSIS spooled its buffers to disk.  Doggone it.

I guess I'd figured that notifications were sent based on server-wide memory state.  But maybe memory state on an individual NUMA node can lead to a memory notification?
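A couple of DMVs at least show how SQL Server's own memory sits across the nodes, and whether the Windows low-memory signal is set right now.  A sketch - with the caveat that sys.dm_os_sys_memory only exposes the server-wide signal, so it can't fully answer the per-node question:

*****
-- SQL Server memory per NUMA node.
SELECT memory_node_id,
       pages_kb,
       virtual_address_space_committed_kb
FROM   sys.dm_os_memory_nodes;

-- Server-wide Windows low memory resource notification state.
SELECT available_physical_memory_kb,
       system_low_memory_signal_state
FROM   sys.dm_os_sys_memory;
*****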

The target SQL Server instance, a beefy one, was also on this physical server.  There's 1.5 TB of RAM on the server.  :-)

We lowered 'max server memory' by 100 GB, hoping to get a reasonable amount of free memory on NUMA node 1 for tomorrow's ETL.  (The server was also rebooted for an unrelated reason.)
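For reference, the knob we turned looks like this - the value below is illustrative, not the actual setting:

*****
-- 'max server memory' is in MB; lowering it by ~100 GB means
-- subtracting ~102400 from the current value.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 1300000;
RECONFIGURE;
*****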

I dunno though.  This is SQL Server 2014 SP1 CU2.   

Memory is already being consumed unevenly across the NUMA nodes.  If NUMA node 1 gets the short end of the stick by 6:00 am tomorrow, I think for now we may try to affinitize SSIS to NUMA node 2, in hopes that its memory allocations will come from that node and that enough free memory will remain there even while NUMA node 1 gets short-changed.

*****Update*****
A blog post showing use of Windows System Resource Manager to affinitize SSIS.  But... that feature was deprecated in Windows Server 2012 R2. 
:-( Sad trombone.  

Maybe a future version of SSIS could support affinity natively, similar to the SQL Server database engine?  Or maybe SSIS could be made NUMA-aware?  Or...

Well, at any rate, for maximum scalability and flexibility, gotta find some way of making this work...




Monday, June 13, 2016

L'esprit de L'escalier - Whither SQL Server Backup Buffers?

I've got a long, long list of interesting questions.  And I blog frightfully slowly :-)

So... why don't I start blogging more of the questions?  Someone else may very well be able to springboard from one of my questions to a *really* astounding answer!  Maybe.  I hope my questions are that good anyway :-)

Sometimes the questions come to me as a form of L'esprit de L'escalier - not something I wish I'd said but a question I wish I had asked.

So it is with my question: "Whither SQL Server Backup Buffers?"

Yesterday (or maybe really early this morning) I posed the question: "Whence SQL Server Backup Buffers?"

Here's what I noticed: 'Target Server Memory (KB)' of 33553968 hadn't been achieved.  'Total Server Memory (KB)' during the backup was 14947794.  Of that, 4194496 KB was allocated to MEMORYCLERK_BACKUP.
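(Those numbers are all in KB.  A sketch of where they come from:)

*****
-- Target vs. total server memory.
SELECT RTRIM(counter_name) AS counter_name, cntr_value
FROM   sys.dm_os_performance_counters
WHERE  counter_name IN (N'Target Server Memory (KB)',
                        N'Total Server Memory (KB)');

-- Memory held by the backup memory clerk, per NUMA node.
SELECT memory_node_id,
       SUM(pages_kb)                    AS pages_kb,
       SUM(virtual_memory_committed_kb) AS virtual_memory_committed_kb
FROM   sys.dm_os_memory_clerks
WHERE  type = N'MEMORYCLERK_BACKUP'
GROUP BY memory_node_id;
*****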

When the backup was cancelled, the MEMORYCLERK_BACKUP allocation was simply returned to the OS.

Why not give it to free memory?  It should have been nice, contiguous chunks of memory addresses.  The server was far from reaching target server memory.  The server wasn't under memory pressure.  This is what it looks like right now... and it would have looked nearly the same right after the backup was cancelled.
That's a lot of free memory.  Take away 4 GB from it for the backup buffers... yeah, it's still a lot of free memory.

I wonder:
-If 'max server memory' wasn't being overridden by a more reasonable target (because max server memory is 120 GB on a 32 GB VM), would the behavior still be the same before reaching target?  I bet it would be.
-Is this behavior specific to not having reached target?  Or when reaching target would backup buffers be allocated, potentially triggering a shrink of the bpool, then returned to the OS afterward requiring the bpool to grow again?
-What other allocations are returned directly to the OS rather than given back to the memory manager to add to free memory?  I bet CLR does this, too.  Really large query plans?  Nah, I bet they go back into the memory manager's free memory.
-Is this a big deal?  I bet it could be.  Especially if a system is prone to developing persistent foreign memory among its NUMA nodes.  In most cases, it probably wouldn't matter.

Maybe I'll get around to testing some of this.  But it probably won't be for a while.