Thursday, November 19, 2015

SQL Server gives 110%!! (And sometimes a little more...)

Perfmon "Resource Pool Stats" sometimes shows CPU utilization in excess of 100% - sometimes approaching 120%.
I can't explain it.  But I can eliminate an explanation.   This is a an older Intel X5675 (6 cores per socket, dual socket), due for refresh.

http://ark.intel.com/products/52577/Intel-Xeon-Processor-X5675-12M-Cache-3_06-GHz-6_40-GTs-Intel-QPI

I trust the utilization reported by "Processor Info".  Note that the greatest reported "Resource Pool Stats" utilization (approaching 120%) is when total "Processor Info" utilization is near 100% across all 12 physical/24 logical cores.  Nominal rating of the core is 3.06 GHz, top SpeedStep is 3.46 GHz.  That would give a maximum ratio of 3.46/3.06 = 113%, which is still under the number reported by SQL Server (for Default pool alone, I'll add).  Even if the numbers made it seem possible that SpeedStep was responsible for more than 100% utilization reported by SQL Server, I don't think SpeedStep is the culprit.  The older Intel processors were by default conservative with SpeedStep, to stay well within power and heat envelope.  And no-one's been souping this server up for overclocking :-)

So... if my database engine will give 110% (and sometimes more...) I guess I better, too.  :-)

Even though the CPU utilization reported by SQL Server Resource Pool Stats can't be strictly compared to Processor Info, it can still be valuable.  If there are other user Resource Pools, comparing CPU utilization among them can still be useful.  (On this particular server there is only Default and Internal.)  It is also very helpful to me to compare CPU utilization reported for the resource pool to number of active requests and number of active parallel workers in the resource pool.

When I see this graph...





In light of the low efficiency of CPU utilized in terms of lookups/sec (below - this relationship is of course not static across workloads but is meaningful when familiar with a particular workload), I suspect that either there are spinlocks to be mitigated with a trace flag (ie 8048, 9024, 1236) or the workload is overthreaded and would work more efficiently with a lower total number of parallel threads.


 

No comments:

Post a Comment