Wednesday, April 26, 2017

SQL Server - DBCC Clonedatabase vs Scripting Objects

Although the 'dbcc clonedatabase' command is new, the idea of clone databases as a 'dataless' database but including table & index definitions as well as optimizer statistics is not new.

Here's a blog post from May of 2010 describing the longstanding strategy of using the scripting functionality from SSMS to create a statistics clone database in SQL Server.

How to create a SQL Server Clone Statistics Only Database
https://www.mssqltips.com/sqlservertip/1999/how-to-create-a-sql-server-clone-statistics-only-database/

I wanted to put this scripting option in SQL Server/SSMS 2016 head-to-head against the clonedatabase command to see how everything turns out.  Why not do that in front of a crowd of 250 people?  (Well - not really.  I did run the clonedatabase in front of everybody though.)

Here's what was in the source database.  The scripting option won't let me script out the Query Store, of course.  But clonedatabase command ill include it unless the option to exclude is specified.



So, to make sure it was apples-to-apples comparison between the 'clone' methods, I toggled a few of the advanced scripting options.



Then I kicked it off. Thank goodness it was several hours before my presentation. I bet a dollar with a colleague over how long it would take to complete. He guessed 56 minutes. I thought 1 hour 25 minutes. Thank goodness it was "Price is Right" rules, cuz we were both way off. It took 2 hours and 2 minutes. Now I'm $1.00 richer.😄

The end result of the scripting exercise was this file clocking in at over 500 mb. I haven't loaded it into a target database yet - I'll report how long it too and resulting database size once I have.



Then... during my presentation late in the day, in front of about 250 people, I launched the dbcc clonedatabase command. It took 37 seconds. Yayy!! 😁

The end result? 1096 mb of transaction log, and 896 mb of row files.  (Remember that 666 mb of the data in the row files is Query Store.)



Let's take a look at the inventory of the Query Store. There's a lot of stuff in there.



There are some concerns with clonedatabase. Sensitive data may be included in Query Store query plans/text if sensitive values are used as literals in queries. Sensitive data may be included in optimizer statistics on sensitive columns. Intellectual property can be a concern in the stored procedures or functions included. Intellectual property may even be a concern for schema design.

I will point out that the only new concern (that I'm aware of) with the dbcc clonedatabase command, versus the scripting option is the possibility of sensitive data included in Query Store query plans and query text.

There's a very good post below for security/sensitivity considerations.

Understanding data security in cloned databases created using DBCC CLONEDATABASE
https://blogs.msdn.microsoft.com/sql_server_team/understanding-data-security-in-cloned-databases-created-using-dbcc-clonedatabase/

In spite of the security/sensitivity considerations, I'm really excited about the clonedatabase command.  Not just because of the incredible speed bump compared to scripting.  Also because including Query Store in the clone is a natural fit.

Even if security/sensitivity issues complicate sharing a clone database with Microsoft or other off-site vendors, the value of using a clone database for investigating query shape, cardinality estimates, or even query store investigation decoupled from a production database should be high.

1 comment:

  1. Yeah, the speed is definitely excellent. It also avoids issues with FK order of creation that happens when scripting is used. Good stuff.

    ReplyDelete