Thursday, April 25, 2013

Why is SQL Server PLE plummeting? Should I care?

Because the SQL Server page life expectancy is a computed value, it should earn your trust for various uses.  In fact, on NUMA servers with SQL Server NUMA support enabled (as it is by default), its not just a computed value, its an aggregation of computed values across independently managed database buffer pool cache segments, one per NUMA node.  There is some type of weighting involved in the calculation of the "general" PLE from multiple PLEs, as well.  I've looked at the numbers just enough to decide its not worth my time right now to determine how the node PLEs are weighted in figuring the general PLE.

For all of its warts, PLE is still valuable.  Within the activity profile of a given system, its a way to compare database buffer cache churn under different workloads.  Assuming the same system and the same workload, faster buffer churn typically correlates to a higher physical read load.

So - assuming the same or similar workload on a given system, a plummeting PLE can be a concern - it can drive up the physical read load.  Increasing the read load won't necessarily increase execution time, due to the high reliance of SQL Server on prefetch and readahead, which I believe was quite well implemented.  However, driving up the read load for the same workload makes the system a less friendly neighbor on shared resources such as SAN.  Even if the physical SAN disks for the SQL Server system are isolated from the disks for other systems, its possible for the increased read load to saturate front end director bandwidth (I always think of this as the link from host to SAN cache, because I believe cache is king), front end director CPU, front end director port queue depth, back end director CPU, back end bus bandwidth, and FCAL loop bandwidth if not lucky enough to be using back end SAS connections instead of arbitrated loops.  Increased read load from SQL Server can also lead to increased SAN read cache churn if SAN read cache is shared with other hosts.  SQL Sever systems generally don't care about SAN read cache much... but the other systems might be severely inconvenienced by having their contents churned out of read cache.  Enough blabbering - trust me - doing more work on the SAN than necessary is bad for other SAN consumers, and there are lots of places that badness can materialize.  If the SQL Server system is a guest on a shared virtual host, there are shared host resources that could be overburdened as well, such as the physical HBA.

Don't freak out based on PLE alone, but don't ignore the PLE either.  Please.

Are physical read operations rising while PLE is plummeting?  Does physical read bandwidth rise while PLE is plummeting?  Those are some of the other items in perfmon that I look at when I see irregularities in PLE.  And there's another condition to consider...

"Silly sasquatch - if PLE is dropping, reads MUST be increasing!  How could it be any other way?? Maybe you should go back to cobol..."

Not so fast, young warrior.  First of all, let's assume a constant database cache size.  If SQL Server is as well-coded as the databases of my youth - now considered antiques at over 30 years old - then the first write to a newly allocated user database block (at the very least, a new user database block in a new extent) should not require a physical read before written to the buffer cache, and flushed later as a write to disk.  That means... high write activity in a constant-size database cache will lead to lower PLE not solely based on reads of database blocks into the buffer cache.  Spreadsheets or graphs indicating that are left, for now, as a reader exercise.  :)

But, there's another potentially greater factor that can push down PLE, other than physical reads and logical writes: the expansion of "stolen memory" reported by perfmon.

Its important to distinguish between "stolen memory" and "granted workspace memory".  Sometimes I see a description of high "granted workspace memory", and the potentially resulting pending memory grants, as "memory pressure".  I don't think that's an entirely accurate description.  Its really something more like "promised memory pressure".  You may not believe me - you are free to try to get the numbers in perfmon to add up to something reasonable.  Or you can trust me :)  The numbers reported by perfmon of database pages, stolen pages, and free pages add up to... total pages.  You won't be able to fit granted workspace memory in there - I've tried.  Now, granted workspace memory - reserved memory should be equal or less than  "stolen memory".  That's because "stolen memory" is "active query memory", plus some other stuff.  I'll return to that in a later post.  I promise.

Anyway... assume a system that has already achieved the "max server memory" target for the buffer pool.  Assume 5 long-running queries on the system, each with query memory grants (which can be monitored with sys.dm_exec_query_memory_grants).  At the start of those queries, they may have very little use for the query memory they've been granted.  Why zero it?  Why not let the database pages cached in those buffers stay there, until the intermediate and final resultsets of the query actually need the buffers to be stored?  In fact - some queries request way, way more query memory than they will use.  Zeroing the contents of the granted query memory before the query workers are ready to fill them would needlessly age database pages out of the database cache.

But... if those queries have a good fit between their memory grant requests and the memory they'll actually use... over time they will steal memory from the database cache... up to their memory grant... and the database cache page count gets lower.  With a lower database page count, and stable rates of page insertion and retirement, the page life expectancy is lowered.  Some folks like to call the PLE concept "buffer cache hold time" - I actually think that sounds cooler, but what do I know?

Anyway... next time you look at a system and see wild changes in PLE - whether global PLE or per NUMA node PLE... consider the pace of data pages into the buffer, the pace of data page first write in the buffer, and the pace of "stolen memory" stealing as potential causes for rapid changes in PLE.  It might save you from a panic attack.

On the other hand... if you eliminate those conditions as factors in drastically changing PLE...

****  update by sql_sasquatch 26 September 2013****
This thread is good discussion of PLE issues specific to early builds of SQL Server 2012
http://bit.ly/19z5GTj 

That thread discusses these SQL Server 2012 two hotfixes.

kb2819662 SQL Server performance issues in NUMA environments
http://bit.ly/Z9C25x

kb2845380 You may experience performance issues in SQL Server 2012 Edit
http://bit.ly/13LADVq

No comments:

Post a Comment