Tuesday, July 8, 2014

Before I drop all those SQL Server auto-generated column stats....

***** Update 20180628 *****
I'm leaving this post for historical reasons.  But if you want to parse plan XML for optimizer stats detail included due to trace flag 8666, please don't use the query included here.  Its ugly, has unintended cartesian products which the DISTINCT attempts to mitigate, and performs poorly.

Instead, use something more like the query at this new blog post:

Harvesting SQL Server Trace Flag 8666 stats from Query Plan XML

http://sql-sasquatch.blogspot.com/2018/06/harvesting-sql-server-trace-flag-8666.html

*****


***** Update 20170104 *****
Just verified that this still works on post-2016 vNext 1.1 :-)
*****

A while back I ran across an intriguing suggestion to simply drop all auto-created columns statistics in a database, by Ami Levin.  Read the blog post here:
http://blogs.msdn.com/b/mvpawardprogram/archive/2013/09/09/sql-server-auto-statistics-cleanup.aspx
 
I wouldn't want to drop all column statistics without an idea of which are being used and which are neglected.

A colleague asked me about evaluating the statistics in use on a running system.  At first all I remembered was trace flags 9204 & 9292 (see trace flag 2363 for SQL Server 2014 and beyond).  Paul White blogged about these here:
http://sqlblog.com/blogs/paul_white/archive/2011/09/21/how-to-find-the-statistics-used-to-compile-an-execution-plan.aspx

That allows investigating an individual query (an activity I always encourage for developers and report writers)... but not helpful for my context.

But in the comments to Paul's blog post Fabiano Neves Amorim showed up!  I read this blog post about a year ago... but without an immediate need it just kind of filed away in my mind somewhere near the plots and plot holes of old James Bond films I watched.  This time I paid attention :-)
http://blogfabiano.com/2012/07/03/statistics-used-in-a-cached-query-plan/

So, trace flag 8666 seems to be the key for what I want to do.  It allows peeking into 'InternalInfo' in the plan XML, in order to get table names (wszTable) and stat names (wszStatName) that were used to compile plans.  Cool!  So here's the disclaimer:

Trace flag 8666 is NOT an officially documented SQL Server trace flag.  I do not recommend using this trace flag without a tight leash!  In fact, any querying of the plan cache can become quite intensive - plans can be big and pulling lots of them into RAM can flush other database contents out while other folks are trying to get work done!  I know cuz I've caused trouble for folks - so do better than me!  Also, there's absolutely no guarantee that trace flag 8666 will continue to allow this insight in the future.


But, here's a new toy of mine, for profiling stats use on performance test servers when a full batch load of queries is loaded up.

There's no doubt in my mind that some folks reading this are WAY better at handling xquery or XML in general than I am.  If there are suggestions to improve the performance of this thing, I'm all ears...

But... here's my toy... in an early incarnation, anyways...
Note that in order to somewhat limit scope, only plans in the database of execution are considered based on sys.dm_exec_plan_attributes.

********************************************************************************************************

/* do NOT use in production unless CPU and memory utilization are well understood */
DBCC TRACEON(8666) with no_infomsgs;
;WITH 
   XMLNAMESPACES ('http://schemas.microsoft.com/sqlserver/2004/07/showplan' as p),
   handles (usecounts, plan_handle)
   AS
   (SELECT
       usecounts,
    plan_handle
    FROM 
       sys.dm_exec_cached_plans AS cp
    OUTER APPLY
       sys.dm_exec_plan_attributes(plan_handle) AS epa
    WHERE 
       cp.cacheobjtype = 'Compiled Plan'
       AND
       epa.attribute='dbid'
       AND
       epa.value=DB_ID())
SELECT DISTINCT
   TableUsed.XMLCol3.value('@FieldValue','NVarChar(500)') AS TableName
   ,StatsUsed.XMLCol5.value('@FieldValue','NVarchar(500)') AS StatsName
   ,UpdateThreshold.XMLCol6.value('@FieldValue','NVarchar(500)') AS ullThreshold
   ,handles.usecounts
   ,handles.plan_handle
FROM
   handles AS handles
CROSS APPLY
   sys.dm_exec_query_plan(handles.plan_handle) AS qp
CROSS APPLY
   qp.query_plan.nodes('//p:InternalInfo') AS InternalInfo(XMLCol1)
CROSS APPLY
   InternalInfo.XMLCol1.nodes('p:EnvColl/p:Recompile') AS Recomp(XMLCol2)
CROSS APPLY
   Recomp.XMLCol2.nodes('p:Field[@FieldName="wszTable"]') AS TableUsed(XMLCol3)
CROSS APPLY
   Recomp.XMLCol2.nodes('p:ModTrackingInfo') AS Tinfo(XMLCol4)
CROSS APPLY
   Tinfo.XMLCol4.nodes('p:Field[@FieldName="wszStatName"]') AS StatsUsed(XMLCol5)
CROSS APPLY
   Tinfo.XMLCol4.nodes('p:Field[@FieldName="ullThreshold"]') AS UpdateThreshold(XMLCol6)
WHERE
   qp.query_plan.exist('//p:InternalInfo') = 1
   AND
   InternalInfo.XMLCol1.exist('p:EnvColl/p:Recompile') = 1
   AND
   Recomp.XMLCol2.exist('p:Field[@FieldName="wszTable"]') = 1
   AND
   Recomp.XMLCol2.exist('p:ModTrackingInfo') = 1
   AND
   Tinfo.XMLCol4.exist('p:Field[@FieldName="wszStatName"]') = 1
   AND
   TableUsed.XMLCol3.value('@FieldValue','NVarChar(500)') NOT LIKE 'sys%'
ORDER BY
   handles.usecounts DESC, handles.plan_handle DESC;
DBCC TRACEOFF(8666) with no_infomsgs;



No comments:

Post a Comment