Tuesday, July 15, 2014

Hitachi VSP and Universal Replicator: Asynchronous replication is not free


Occasionally I'll hear the idea that asynchronous array-level replication does not effect primary system performance.  That's just not true.  It doesn't effect performance in the same way as synchronous replication does. But that's different from not affecting primary system performance at all*.

But why take my word from it?  EMC and Hitachi are both really good at array level replication.  I've already quoted from EMC documentation on my blog today, so now I'll switch to Hitachi documentation :-)

****************************************************************************
Hitachi Virtual Storage Platform
Hitachi Universal Replicator User Guide

www.hds.com/assets/pdf/vsp-hitachi-universal-replicator-user-guide.pdf

Performance is affected because of journal-to-journal copying. Usage rates are also affected.
Page 1-3 

****************************************************************************

*It is possible that a database with asynchronous writes and a light enough level of writes may see the effects of asynchronous - even synchronous - array replication without noticeably impeding query completion speed.  But I'm talking about pedal-to-the-metal data warehouse loads and the like :)

EMC VMAX Virtual Provisioning: Striped Metas now generally recommended




For some reason, I find it very easy to come across REV A12 of the EMC® Symmetrix®Virtual Provisioning™ documentation referenced below compared to REV A14.  Putting up this blog post more for myself than others... perhaps this way I'll easily be able to track down REV A14.  For me REV A14 is important because its when the general recommendation switched from concatenated metas to striped metas.  No doubt the performance benefits were the motivation, and EMC waited until enough operations on striped metas were possible to ease administrative burden of recommending striped metas.

I find the recommendation of striped metas especially important for Oracle redo logs and SQL Server transaction logs when SRDF synchronous replication is used.  This is because the serialization required for replication has less of a performance drag with striped metas than concatenated metas. Striped metas may also help when asynchronous replication is used, because of the write cache policies on the VMAX.

************************************************************************************
Best Practices for Fast, Simple Capacity Allocation with EMC® Symmetrix®Virtual Provisioning™


In most cases, EMC recommends using concatenated rather than striped metadevices with Virtual Provisioning.
Page 12

************************************************************************************
Best Practices for Fast, Simple Capacity Allocation with EMC® Symmetrix®Virtual Provisioning™
Document 7383 300-006-718 REV A14  (link requires EMC Account login)

In most cases, EMC recommends using striped rather than concatenated metadevices with Virtual Provisioning because they offer the following advantages:

  • With Synchronous SRDF®, Enginuity allows one outstanding write per thin device per path. With concatenated metadevices, this could cause a performance problem by limiting the concurrency of writes. 
    This limit will not affect striped metadevices in the same way because of the small size of the metavolume stripe (1 cylinder or 1920 blocks).
  • In Enginuity releases prior to 5875, the code allows eight read requests per path per thin device. This may limit the number of read requests that can be passed through to the thin pool regardless of the number of data devices it contains. This can cause slower performance in environments with a high read miss rate. A striped metadevice offers improved throughput, because each metamember is allowed to queue requests to the backend in parallel. Even with the limit removed in Enginuity 5875, there are still potential performance benefits to using striped thin metas over concatenated thin metas.
  • Symmetrix Enginuity has a logical volume write pending limit to prevent one volume from monopolizing writeable cache. Because each metamember gets a small percentage of cache, a striped meta is likely to offer more writeable cache to the metavolume.
However, concatenated thin metadevices do offer two important operational advantages over striped thin metadevices:

  • Non-metadevices can be converted to concatenated metadevices without destroying the existing data. Beginning with Enginuity 5875 and Solutions Enabler 7.2, it’s possible to convert a thin device into a concatenated meta without unmapping the device first.
  • Concatenated thin metadevices can be expanded without destroying existing data by adding metamembers to an existing metadevice. Beginning with Enginuity 5875, a striped metadevice can also be expanded while preserving existing data. 
    Pages 12-13
 


Tuesday, July 8, 2014

Before I drop all those SQL Server auto-generated column stats....

***** Update 20180628 *****
I'm leaving this post for historical reasons.  But if you want to parse plan XML for optimizer stats detail included due to trace flag 8666, please don't use the query included here.  Its ugly, has unintended cartesian products which the DISTINCT attempts to mitigate, and performs poorly.

Instead, use something more like the query at this new blog post:

Harvesting SQL Server Trace Flag 8666 stats from Query Plan XML

http://sql-sasquatch.blogspot.com/2018/06/harvesting-sql-server-trace-flag-8666.html

*****


***** Update 20170104 *****
Just verified that this still works on post-2016 vNext 1.1 :-)
*****

A while back I ran across an intriguing suggestion to simply drop all auto-created columns statistics in a database, by Ami Levin.  Read the blog post here:
http://blogs.msdn.com/b/mvpawardprogram/archive/2013/09/09/sql-server-auto-statistics-cleanup.aspx
 
I wouldn't want to drop all column statistics without an idea of which are being used and which are neglected.

A colleague asked me about evaluating the statistics in use on a running system.  At first all I remembered was trace flags 9204 & 9292 (see trace flag 2363 for SQL Server 2014 and beyond).  Paul White blogged about these here:
http://sqlblog.com/blogs/paul_white/archive/2011/09/21/how-to-find-the-statistics-used-to-compile-an-execution-plan.aspx

That allows investigating an individual query (an activity I always encourage for developers and report writers)... but not helpful for my context.

But in the comments to Paul's blog post Fabiano Neves Amorim showed up!  I read this blog post about a year ago... but without an immediate need it just kind of filed away in my mind somewhere near the plots and plot holes of old James Bond films I watched.  This time I paid attention :-)
http://blogfabiano.com/2012/07/03/statistics-used-in-a-cached-query-plan/

So, trace flag 8666 seems to be the key for what I want to do.  It allows peeking into 'InternalInfo' in the plan XML, in order to get table names (wszTable) and stat names (wszStatName) that were used to compile plans.  Cool!  So here's the disclaimer:

Trace flag 8666 is NOT an officially documented SQL Server trace flag.  I do not recommend using this trace flag without a tight leash!  In fact, any querying of the plan cache can become quite intensive - plans can be big and pulling lots of them into RAM can flush other database contents out while other folks are trying to get work done!  I know cuz I've caused trouble for folks - so do better than me!  Also, there's absolutely no guarantee that trace flag 8666 will continue to allow this insight in the future.


But, here's a new toy of mine, for profiling stats use on performance test servers when a full batch load of queries is loaded up.

There's no doubt in my mind that some folks reading this are WAY better at handling xquery or XML in general than I am.  If there are suggestions to improve the performance of this thing, I'm all ears...

But... here's my toy... in an early incarnation, anyways...
Note that in order to somewhat limit scope, only plans in the database of execution are considered based on sys.dm_exec_plan_attributes.

********************************************************************************************************

/* do NOT use in production unless CPU and memory utilization are well understood */
DBCC TRACEON(8666) with no_infomsgs;
;WITH 
   XMLNAMESPACES ('http://schemas.microsoft.com/sqlserver/2004/07/showplan' as p),
   handles (usecounts, plan_handle)
   AS
   (SELECT
       usecounts,
    plan_handle
    FROM 
       sys.dm_exec_cached_plans AS cp
    OUTER APPLY
       sys.dm_exec_plan_attributes(plan_handle) AS epa
    WHERE 
       cp.cacheobjtype = 'Compiled Plan'
       AND
       epa.attribute='dbid'
       AND
       epa.value=DB_ID())
SELECT DISTINCT
   TableUsed.XMLCol3.value('@FieldValue','NVarChar(500)') AS TableName
   ,StatsUsed.XMLCol5.value('@FieldValue','NVarchar(500)') AS StatsName
   ,UpdateThreshold.XMLCol6.value('@FieldValue','NVarchar(500)') AS ullThreshold
   ,handles.usecounts
   ,handles.plan_handle
FROM
   handles AS handles
CROSS APPLY
   sys.dm_exec_query_plan(handles.plan_handle) AS qp
CROSS APPLY
   qp.query_plan.nodes('//p:InternalInfo') AS InternalInfo(XMLCol1)
CROSS APPLY
   InternalInfo.XMLCol1.nodes('p:EnvColl/p:Recompile') AS Recomp(XMLCol2)
CROSS APPLY
   Recomp.XMLCol2.nodes('p:Field[@FieldName="wszTable"]') AS TableUsed(XMLCol3)
CROSS APPLY
   Recomp.XMLCol2.nodes('p:ModTrackingInfo') AS Tinfo(XMLCol4)
CROSS APPLY
   Tinfo.XMLCol4.nodes('p:Field[@FieldName="wszStatName"]') AS StatsUsed(XMLCol5)
CROSS APPLY
   Tinfo.XMLCol4.nodes('p:Field[@FieldName="ullThreshold"]') AS UpdateThreshold(XMLCol6)
WHERE
   qp.query_plan.exist('//p:InternalInfo') = 1
   AND
   InternalInfo.XMLCol1.exist('p:EnvColl/p:Recompile') = 1
   AND
   Recomp.XMLCol2.exist('p:Field[@FieldName="wszTable"]') = 1
   AND
   Recomp.XMLCol2.exist('p:ModTrackingInfo') = 1
   AND
   Tinfo.XMLCol4.exist('p:Field[@FieldName="wszStatName"]') = 1
   AND
   TableUsed.XMLCol3.value('@FieldValue','NVarChar(500)') NOT LIKE 'sys%'
ORDER BY
   handles.usecounts DESC, handles.plan_handle DESC;
DBCC TRACEOFF(8666) with no_infomsgs;