Thursday, August 28, 2014

Why the read bytes/sec valleys in SQL Server backup to NUL?



It only took about a year of elapsed time for me to figure out the explanation below :-)

The perfmon graphs below are from a backup to NUL on the same performance test rig as my previous post (http://sql-sasquatch.blogspot.com/2014/08/ntfs-cache-sql-server-2014-perfmon.html).  There was no other significant activity within SQL Server or on the server at the time.  One hour, 8 minutes, 59 seconds to back up 2.25 TB of database footprint.  Not bad, but why does bytes/sec throughput dip so much at a few spots?  Especially when reads/sec have increased at those same spots?
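For reference, a backup to NUL like this is just plain T-SQL; here's a minimal sketch, with the database name and WITH options as my assumptions rather than the exact command from the test rig.  (Backing up 2.25 TB in 4,139 seconds works out to roughly 540-570 MB/sec on average depending on decimal vs binary units, comfortably under the 800 MB/sec SAN cap discussed below.)

-- Minimal sketch of a backup to NUL; [TestDB] and the options are assumptions.
BACKUP DATABASE [TestDB]
TO DISK = N'NUL'
WITH COPY_ONLY,   -- don't disturb the differential base while testing
     STATS = 10;  -- progress message every 10 percent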



Maybe the dips happen because read latency is higher at those points? Nope, read latency trends roughly with bytes/sec: latency tends to be lower when bytes/sec is lower. 


This behavior during SQL Server native backup puzzled me to no end when I first saw it on a live system about a year ago.  I thought it was due to high write cache pending on one controller of a dual-controller array.  High write cache pending typically results in increased read and write latency.  My theory was that the higher write latency could saturate the backup's in-memory buffers, in turn throttling read activity and read bytes/sec.  But in this in-house testing, high write cache pending is eliminated as a potential throttle because the backup target is "DISK=N'NUL'".

The inverse correlation of reads/sec with sec/read is even stronger than the positive correlation of bytes/sec with sec/read.  Cherry-red latency is lowest when brick-red reads/sec is highest.


So, what's happening, and how can it be remedied?  I haven't nailed down the specifics yet*, but this is due to fragmentation/interleaving.  How do I know?  Because the behavior was nonexistent when the database was pristine, right after it was created for testing, and it increased after a few things I did to purposely inject fragmentation :-).  When the fixed number of backup read threads hit fragmentation of some type(s), they start issuing smaller reads.  The smaller reads are met with lower read latency, but the tradeoff still results in noticeable dips in bytes/sec throughput.
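I haven't pinned down yet which flavor of interleaving matters most (see the footnote below), but something along these lines is a cheap way to eyeball it: small average fragment sizes mean the backup readers find fewer long contiguous runs of pages to pick up in a single large read.  This is just a sketch against the current database; the page_count filter is arbitrary, and this view only describes allocated extents per index, not free extents.

-- Rough look at fragment sizes; small avg_fragment_size_in_pages suggests
-- the backup readers have fewer long contiguous runs to work with.
SELECT  OBJECT_NAME(ips.object_id)        AS object_name,
        ips.index_id,
        ips.page_count,
        ips.fragment_count,
        ips.avg_fragment_size_in_pages,
        ips.avg_fragmentation_in_percent
FROM    sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
WHERE   ips.page_count > 1000             -- ignore tiny objects; threshold is arbitrary
ORDER BY ips.avg_fragment_size_in_pages;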

How can it be fixed?  In this specific case, if the reader threads are tripled, most of the time there will be increased read and write latency and not much increase in bytes/sec, since it's limited by SAN policy to 800 MB/sec max on this particular system.  But during those noticeable throughput valleys (which bottom out at 240 MB/sec), the higher number of threads will submit more aggregate small reads/sec, resulting in higher bytes/sec (maybe even close to 720 MB/sec during the valleys, since the current thread count hits 240 MB/sec) and a shorter elapsed time for the backup.  I'll try a few things, and there will be a follow-up post in the future.
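A set of knobs I plan to experiment with in that follow-up: striping the backup across several NUL devices and raising BUFFERCOUNT and MAXTRANSFERSIZE, which generally allows more backup buffers and more reads in flight at once.  Whether that actually amounts to tripling the reader threads on this rig is exactly what the testing has to confirm; the values below are guesses for illustration, not a recommendation.

-- Sketch only: the device count and option values are guesses to be tested.
BACKUP DATABASE [TestDB]
TO DISK = N'NUL', DISK = N'NUL', DISK = N'NUL'   -- three devices instead of one
WITH COPY_ONLY,
     BUFFERCOUNT = 192,            -- more buffers allow more outstanding reads
     MAXTRANSFERSIZE = 4194304,    -- 4 MB per transfer, the maximum allowed
     STATS = 5;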

*I'm not sure yet whether this is due to free extents only, or whether free pages within allocated extents also cause smaller backup reads.  I don't think backup reads will be smaller for extents that are fully page-allocated but have tables and indexes interleaved among them.  More testing is needed to find out specifically what kind of interleaving is causing this.

*****

After I published this post, I noticed I hadn't included any of the per-LUN measures from this backup.  There are 8 equally sized LUNs, each with an equally sized row file for the database being backed up, and a nearly equal amount of data in each of the files.  And the behavior?  Almost exactly the same on each LUN at all points of the backup to NUL.
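For anyone who'd rather confirm that symmetry from inside SQL Server than from the per-LUN perfmon counters, a before-and-after snapshot of the per-file virtual file stats tells the same story.  Sketch only; the database name is an assumption.

-- Per-file read activity; with one row file per LUN, each row maps to one LUN.
SELECT  mf.physical_name,
        vfs.num_of_reads,
        vfs.num_of_bytes_read,
        vfs.io_stall_read_ms
FROM    sys.dm_io_virtual_file_stats(DB_ID(N'TestDB'), NULL) AS vfs
JOIN    sys.master_files AS mf
        ON  mf.database_id = vfs.database_id
        AND mf.file_id     = vfs.file_id
WHERE   mf.type_desc = N'ROWS'            -- data files only
ORDER BY mf.physical_name;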
