Friday, October 28, 2016

SQL Server Memory Accounting with Perfmon

After learning to account for database memory accounting with DMVs & DMFs, the numbers from perfmon can be downright confusing.  Lack of alignment with the DMV/DMF values notwithstanding* I believe using perfmon to profile SQL Server resource utilization and response times can be extremely valuable.  For me, its an indespensible tool.  Using a perfmon counters file and a bat script with a logman command, I can easily distribute a tool to collect metrics from a physical server or vm running SQL Server in 30 second intervals.  I wouldn't want to try to collect the same information in 30 second or even 1 minute intervals with SQL Agent or service broker.  Anyway...

Let's look at the total of SQL Server shared memory for SQL Server 2012 and beyond.  As I've mentioned before, the shared memory can be fully accounted for with database cache, stolen, and free categories.

By zooming in on the 108 GB to 124 GB range we can see that the sum of the three categories is equal to "Total Server Memory".

Lets look at stolen memory a bit.  The relationship between memory grants and stolen memory is probably the least intuitive relationship.  Remember - if a query gets a memory grant the grant happens at the beginning of query execution.  Its just a promise of sort/hash memory to be made available when the query needs it.  The grant memory isn't stolen immediately - rather its stolen in small allocations over time as needed by the query.

In the graph immediately below, the outstanding grants are shown over time.  There are no pending grants during the observation period.  Granted memory and reserved memory are both shown as areas, with reserved memory in front of granted memory.  Granted memory is consistently greater than reserved memory (in this case, no resource pools have been added beyond the pre-existing default and internal pools).  This is how we can determine that the reserved memory is granted memory which hasn't been stolen yet.

Here is the difference between granted memory and reserved memory, graphed as single measure area graph below, with the number of outstanding grants in a line graph.

Now lets stack several memory components in an area graph in front of stolen memory.  Let's put in several perfmon metrics directly as well as the difference between granted and reserved memory.

Notice that plan cache pages increase slowly throughout the capture period.  The amount of query memory used (granted kb - reserved kb) is more variable, and together with the other memory metrics selected trends very well with total stolen memory.

There's definitely at least 1 significant stolen memory consumer since up to 5 GB of stolen memory is currently unaccounted for.

Main things I wanted to show here:
-used query memory is accounted for in perfmon stolen memory
-although plan cache bloat is a fairly common topic, some workloads exeperience much more pressure from workspace memory than from the plan cache.

*as well as some other issues with perfmon that I'll detail at a later date

1 comment:

  1. This is good information. Thank you for sharing. However, I'm not very familiar with Perfmon. Is the a tutorial you recommend for setting up the counters file and batch script?