Wednesday, August 5, 2015

SQL Server Workspace Memory - with a Twist

In my recent blog post 'SQL Server Granted/Reserved/Stolen Workspace Memory in Perfmon' I hinted at a twist in the story of perfmon accounting of workspace memory.

Here's the stolen/free/database cache account of memory from perfmon on 'system B'.  Its a big memory system - a terabyte of RAM.


 At times more than 250 GB of RAM is stolen memory.  Wonder how much of that is used workspace memory?  How about inserting an Excel column for the difference between 'SQLServer:Memory Manager\Granted Workspace Memory (KB)' and 'SQLServer:Memory Manager\Reserved Server Memory (KB)'?

Whoa... reserved memory should be a subset of granted memory?!?

The amount of granted memory is so small I can't even see it on the same scale as the reserved memory. Adjust the scale and I can see that 'SQLServer:Memory Manager\Granted Workspace Memory (KB)' rarely has much memory accounted at all.

Well, I know that there are multiple Resource Governor workload groups on this server.  Here's a graph of their CPU activity during this observation time.

WrkGrp_4 is the main source of activity during the observation period. I wonder how these workgroups are distributed among Resource Governor resource pools?  Each resource pool added to the RG results in a new set of memory brokers added to the system, including a new memory broker to account for memory grants within the pool.

Sure - I could show the RG configuration but I'm a charts-n-graphs kinda guy :-)  Comparing this graph of Resource Pool CPU utilization to the previous graph, its clear each of the 4 workload groups is associated with its own resource pool.

With additional resource pools and their brokers in the mix,  the accounting of  'SQLServer:Memory Manager\Granted Workspace Memory (KB)' and 'SQLServer:Memory Manager\Reserved Server Memory (KB)' is expected to change.

We can see that 'SQLServer:Memory Manager\Granted Workspace Memory (KB)' only reports the granted query memory for the default resource pool.

With that being the case,  consistency would predict that 'SQLServer:Memory Manager\Reserved Server Memory (KB)' only report the unused portion of memory grants within the default resource pool.  If that were the case, 'SQLServer:Memory Manager\Reserved Server Memory (KB)' values would always be less than 'SQLServer:Memory Manager\Granted Workspace Memory (KB)' - we've already seen that's not the case.

Below is a transparent red overlay of  'SQLServer:Memory Manager\Reserved Server Memory (KB)' on top of a stacked graph of 'Active memory grant amount (KB)' for each of the resource pools.
That's more like it! 

So there's a fundamental difference between perfmon values for 'SQLServer:Memory Manager\Granted Workspace Memory (KB)' and 'SQLServer:Memory Manager\Reserved Server Memory (KB)': the granted memory values are ONLY for the default resource pool, while the reserved memory values are an aggregate across all resource pools.  Without that important tidbit, memory accounting can get pretty weird if there are active added resource pools to the Resource Governor configuration.

With that piece of knowledge, I add a column to Excel that sums the active memory grant amount for each of the resource pools, then subtract 'SQLServer:Memory Manager\Reserved Server Memory (KB)'.  The result is the total stolen/used workspace memory across all resource pools.  I wish perfmon had a counter for reserved memory per resource pool to accompany its granted per pool counter.  Alas - if you want that you're gonna hafta reap it from sys.dm_exec_query_memory_grants or sys.dm_exec_query_resource_semaphores.

Anyway - with my new computed column I can see that stolen/used query memory is that largest contributor to stolen memory during the observation, and based on the counters for granted per pool I know that the vast majority of the used/stolen query memory during the observation was against grants for ResPool_4.

There are three big reasons I like to track used query memory, especially in the context of Resource Governor:
1. Often query concurrency is limited by the amount of granted memory - but used query memory may be a small fraction of granted throughout.  If there's CPU to spare, using Resource Governor to limit the maximum query grant size can increase system performance by decreasing resource_semaphore waits and increasing query concurrency.
2. Sorting and hashing activity in workspace memory is not counted in the perfmon 'buffer page lookups' counter.  So when there is massive amounts of sorting, it can bend the relationship of CPU utilization to logical IO on the system.  Important to know when modeling system behavior.
3. If there are multiple resource pools active at the same time (other than the pre-existing default and internal pools), the system is vulnerable to out-of-memory exceptions when granted workspace memory becomes overcommitted and non-workspace memory steal/allocations fail.

But those stories will have to wait for yet another day... 

Once you graph the used query memory on a given system, if you see that its just a fraction of granted memory and are experiencing pending memory grants/resource_semaphore waits - you may want to use Resource Governor workload group max query memory grant to increase query concurrency.  An older blog post where I address that.

Resource Governor to restrict Max Query Memory Grant


No comments:

Post a Comment