Monday, April 11, 2016

A few long-winded thoughts about SAN snapshots & #SQLServer backups

SAN snapshots can be split into two categories: dependent snapshots and independent snapshots.

*****
These days, almost all references to snapshots are references to dependent snapshots.

So what do these dependent snapshots depend on?  They depend on the media storing the base image to produce a full, recoverable image.  Dependent snapshots are point-in-time snapshots that are stitched together from the base image and a space-efficient set of disk-level blocks representing point-in-time contents that have since changed.

There are two main implementations of dependent snapshots: copy-on-write and redirect-on-write.
Copy-on-write is fairly intuitive.  Imagine a 1 GB database data file.  Take a dependent COW snapshot, and initially its entirety is metadata - it has no delta data in it yet.  Change an 8 KB block or page in the middle of the file... now, before the changed block can be written to the underlying disk, its previous contents must be retrieved from disk (or cache) and written to the dependent snapshot.  Only after the previous value is written to the snapshot can the new value be written to the underlying file.  Often a COW snapshot will be written to a different set of virtual or even physical disks than the base data.  Some arrays allow (or at least used to allow) the COW snapshot to be stored on the same disks as the underlying base.

One tricky item to note is that the delta is read from disk (or cache) and written to the snapshot in *the snapshot's* defined block size.  Want to change an 8 KB database block/page within a file?  Fine.  But if the snapshot change block size is 64 KB, 64 KB of contents will be read (or fetched from cache) and written before the new 8 KB is written.  That's not too surprising to folks who have worked with various storage - often the disk block size, RAID group stripe size, and array data chunk handling size (they've got tons of vendor and vernacular names, from "chubby chunk" to who-knows-what) are different from host request sizes.  I'm just mentioning it here for the sake of completeness.
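To make that write path concrete, here is a minimal sketch in Python - purely illustrative, with made-up names and an assumed 64 KB snapshot chunk size rather than any array's actual implementation.  It shows both the preserve-before-overwrite step and the chunk-size rounding:

```python
# Hypothetical copy-on-write behavior, tracked at the snapshot's chunk size.
# An 8 KB database page write still preserves the whole 64 KB chunk it lives in.

SNAP_CHUNK = 64 * 1024   # assumed snapshot change granularity

base = {}             # chunk offset -> current contents on the base volume
snapshot_delta = {}   # chunk offset -> original contents preserved for the snapshot

def cow_write(offset, new_page):
    """Write an 8 KB page; preserve the prior 64 KB chunk into the snapshot first."""
    chunk_start = (offset // SNAP_CHUNK) * SNAP_CHUNK
    if chunk_start not in snapshot_delta:
        # First change to this chunk since the snapshot: read the old 64 KB
        # from the base (or cache) and write it to the snapshot area...
        snapshot_delta[chunk_start] = base.get(chunk_start, b"\x00" * SNAP_CHUNK)
    # ...only then is the new data written in place on the base.
    chunk = bytearray(base.get(chunk_start, b"\x00" * SNAP_CHUNK))
    chunk[offset - chunk_start : offset - chunk_start + len(new_page)] = new_page
    base[chunk_start] = bytes(chunk)

def snapshot_read(offset):
    """Point-in-time read: the preserved chunk if it changed, otherwise the base."""
    chunk_start = (offset // SNAP_CHUNK) * SNAP_CHUNK
    return snapshot_delta.get(chunk_start, base.get(chunk_start))
```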

Redirect-on-write snapshots offer point-in-time recovery capabilities similar to COW, and are also dependent on the original base data to produce a full recoverable image.  But the specifics of what happens during a write are different.  Snapshot the 1 GB database file again, this time with a ROW snapshot.  Establishing the snapshot means: start keeping metadata for the layout of the base data at this point in time.  When the 8 KB block in the middle gets changed, it doesn't get "written in place".  It's redirected instead - written to a new location.  The data from the point in time is stitched together from the metadata associated with that PIT, and current data is stitched together from the current metadata.  Sometimes the relationships can get quite complex - metadata trees, etc.  But that's the general idea.  What happens when the redirect-on-write PIT is discarded?  Two separate strategies:
1. discard the metadata associated with the PIT, and don't do anything else.
2. merge the delta data back into the base data - so retaining the PIT just delays the eventual write-in-place of the data changed between establishing the PIT and discarding the PIT.

The first method above is used by NetApp snapshots, and snapshots of some other redirect-on-write or "shadow paging" filesystems.  I'm not sure if anyone uses the second, merge method anymore.
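For contrast, here is an equally hypothetical Python sketch of the redirect-on-write idea (block-map and storage names are made up for illustration).  Note that establishing or discarding a snapshot is purely a metadata operation under the first strategy above:

```python
# Hypothetical redirect-on-write: writes never overwrite in place;
# a snapshot is just a frozen copy of the logical-to-physical block map.

storage = {}        # physical location -> contents
current_map = {}    # logical block -> physical location
snapshots = {}      # snapshot name -> frozen copy of the block map at that PIT
next_location = 0

def take_snapshot(name):
    # Establishing the PIT only copies metadata, no data movement.
    snapshots[name] = dict(current_map)

def row_write(logical_block, data):
    """Redirect the write to a new physical location and update the current map."""
    global next_location
    storage[next_location] = data
    current_map[logical_block] = next_location
    next_location += 1

def read(logical_block, snapshot=None):
    """Stitch the current or point-in-time image together from the right map."""
    block_map = snapshots[snapshot] if snapshot else current_map
    return storage.get(block_map.get(logical_block))

def discard_snapshot(name):
    # Strategy 1: just drop the metadata and do nothing else.
    del snapshots[name]
```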
*****
What about independent snapshots?  I hear these called "clones" a lot.  I try not to use that term too much, because in the parlance of some array vendors a "clone" is still a dependent snapshot.  Not because they are trying to pull the wool over anyone's eyes - just because some storage terms have not been standardized across the industry.

The idea behind an independent snapshot is "after the independent snapshot is complete, point-in-time recovery from snapshot is possible even with loss of the underlying base."

Sometimes you'll hear about people "splitting a mirror" or some such.  Very likely they are talking about establishing an independent snapshot.

Array-level independent snapshots can come in two different flavors:
1. the image is complete (and receives all changes) before and until the PIT is established - then frozen at the split
2. the image is completed after the PIT is established (COW on the base, and fill in everything around it)

"Splitting a mirror", "splitting a BCV", etc implies establishing an independent snapshot that is complete on establish.  Imagine a RAID 5 set of disks, mirrored to another RAID 5 set of disks.  When you want an independent snapshot, you split the mirror.  One side remains active and continues to accept changes from attached hosts.  The other side is frozen at a point-in-time.  As long as proper procedures for contents were followed (quiescing databases, flushing host level cache including filesystem cache, issuing filesystem/volume/group/consistency group freezes, etc) the PIT is a recoverable image of the base.

Let's say that the PIT is then copied to an enterprise backup manager, like Commvault, Tivoli TSM, etc.  Then that same set of disks can be re-used for an independent snapshot of the same base disks.  But... changes have been introduced since it was split.  Some arrays use change tracking (or even two-way change tracking) after the independent snapshot has been split.  After re-joining the PIT disks, only the tracked changes need to be updated on the PIT for it to be up-to-date and ready to be split off again for another independent snapshot PIT.  That can be pretty efficient :-)  But - it does still need some amount of write activity, copying changed blocks from the base.
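A rough Python sketch of that change-tracking resync, assuming a one-time initial full copy followed by delta-only updates (all names are hypothetical - real arrays do this in firmware, not application code):

```python
# Hypothetical change-tracking resync for an independent snapshot resource.

base = {}                    # block id -> contents on the production volume
pit_copy = {}                # block id -> contents on the split-off snapshot disks
changed_since_split = set()  # deltas tracked while the PIT disks are split off

def initial_full_copy():
    """One-time full copy when the snapshot resource is first attached."""
    pit_copy.update(base)
    changed_since_split.clear()

def write_to_base(block_id, data):
    base[block_id] = data
    changed_since_split.add(block_id)

def resync_and_split():
    """Rejoin the PIT disks, copy only the tracked deltas, then split again."""
    for block_id in changed_since_split:
        pit_copy[block_id] = base[block_id]   # far cheaper than recopying everything
    changed_since_split.clear()
    return dict(pit_copy)   # an up-to-date, independent point-in-time image
```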

Other systems re-create the entire image every time the independent snapshot PIT is attached to a base.  1 TB database in the base?  Join an independent snapshot resource to it in this strategy, and the full 1 TB must be copied to it (and any changes in the interim) before the independent snapshot is "caught up".  That can be pretty expensive.  Systems I've known that do this allow the activity to be throttled - which means it takes longer for the PIT resource to be complete.

The other method of independent snapshot is to establish the snapshot and *then* complete the full recoverable image.  Changes to the base incur COW overhead for first writes until the independent snapshot is complete.  Eventually the PIT image is complete.  This method has a scheduling advantage: after it's established, the enterprise backup manager can immediately start reading from it.  Of course, there's the COW overhead on the base until the image is complete.  And reads from the enterprise backup manager that "miss" data in the not-yet-independent snapshot will read from the underlying base.
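Here is a hypothetical sketch of that establish-then-fill behavior (again illustrative Python, not an array's real implementation), showing the COW-on-first-write overhead and the backup reads that fall through to the base until the image is filled in:

```python
# Hypothetical establish-then-fill independent snapshot.

base = {}        # block id -> production contents
pit_image = {}   # block id -> blocks already copied into the snapshot
pending = set()  # block ids not yet copied; empty set means fully independent

def establish_snapshot(all_block_ids):
    # The image starts empty; the backup manager can read immediately anyway.
    pending.update(all_block_ids)

def write_to_base(block_id, data):
    if block_id in pending:
        # COW overhead on the first write: preserve the old value before overwriting.
        pit_image[block_id] = base.get(block_id)
        pending.discard(block_id)
    base[block_id] = data

def backup_read(block_id):
    # A "miss" in the not-yet-independent snapshot is served from the base.
    return pit_image[block_id] if block_id in pit_image else base.get(block_id)

def background_fill(batch=64):
    """Copies remaining blocks until the snapshot is fully independent."""
    for block_id in list(pending)[:batch]:
        pit_image[block_id] = base.get(block_id)
        pending.discard(block_id)
```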

My preference from this group of independent snapshot possibilities is an implementation that uses advanced change tracking, fills in the image on join, and is fully independent when the PIT is established.  There's some additional write load to sync the changes since the last time a PIT was established, but after that the overhead is minimal.  And once the PIT is established, the read load from the backup manager is against the PIT, with no additional reads against the base.


*****

So why did I write this?
I wanted to provide some background for my answer to this question: are array snapshots of SQL Server dependent on production data?

My answer: some snapshots are no more dependent on the production data than SQL Server full backups.

Dependent snapshots certainly are dependent on the production data - in the same way that a SQL Server differential backup is dependent on a corresponding full backup.

But independent snapshots are only dependent on production data until their image is complete.  Independent snapshots have no more dependency on production data than SQL Server full backups do.  My opinion is that independent snapshots with advanced change tracking are *less* dependent on production data than SQL Server full backups are, because when re-syncing they don't need to copy all data - just the delta since the last time they were synced.  In such cases, if the smaller amount of data to sync means the independent snapshot is complete in less time than a SQL Server full backup requires, the independent snapshot is less dependent on production data.
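To put purely illustrative numbers on it: for a hypothetical 1 TB database where roughly 2% of the blocks have changed since the last sync, the change-tracking resync copies on the order of 20 GB before the snapshot is independent again, while a SQL Server full backup has to read (and write out) the full 1 TB every time.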

*****
OK.  Now that all those words are typed, what is my backup preference for SQL Server?  Personally, unless there is an overriding factor, I like native SQL Server backups with a target isolated from the production data.

Denny Cherry's article "Why aren’t SAN snapshots a good backup solution for SQL Server?" expresses my feeling pretty succinctly in the 4th rule :-)
"Fourth rule of backups: Whoever's butt is on the line when the backups can’t be restored gets to decide how the data is backed up."
 http://itknowledgeexchange.techtarget.com/sql-server/arent-san-snapshots-good-backup-solution/ 

Of course, in a given situation backup time/scheduling, available disk or other resources, or data center direction could motivate even me to believe an array-level snapshot may be a better choice.

That's enough for today - I'll avoid the controversial matter of when a dependent vs independent snapshot is preferable.
