For a great introduction to this topic, please see this post at SQLSkills from Joe Sack.
Temporary Statistics for Database Snapshots in SQL Server 2012
https://www.sqlskills.com/blogs/joe/temporary-statistics-for-database-snapshots-in-sql-server-2012/
Today's test instance is running SQL Server 2019 CU6.
OK, let's create a test database.
USE [master]
GO CREATE DATABASE [snap_test_src] CONTAINMENT = NONE ON PRIMARY ( NAME = N'snap_test_src', SIZE = 8192KB , MAXSIZE = UNLIMITED, FILEGROWTH = 65536KB , FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL15.MSSQLSERVER\MSSQL\DATA\snap_test_src.mdf') LOG ON ( NAME = N'snap_test_src__log', SIZE = 8192KB , MAXSIZE = 2GB , FILEGROWTH = 65536KB , FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL15.MSSQLSERVER\MSSQL\DATA\snap_test_src__log.ldf') ALTER DATABASE [snap_test_src] SET AUTO_CREATE_STATISTICS ON ALTER DATABASE [snap_test_src] SET AUTO_UPDATE_STATISTICS ON ALTER DATABASE [snap_test_src] SET AUTO_UPDATE_STATISTICS_ASYNC OFF ALTER DATABASE [snap_test_src] SET ANSI_NULL_DEFAULT OFF ALTER DATABASE [snap_test_src] SET ANSI_NULLS OFF ALTER DATABASE [snap_test_src] SET ANSI_PADDING OFF ALTER DATABASE [snap_test_src] SET ANSI_WARNINGS OFF ALTER DATABASE [snap_test_src] SET ARITHABORT OFF ALTER DATABASE [snap_test_src] SET AUTO_CLOSE OFF ALTER DATABASE [snap_test_src] SET AUTO_SHRINK OFF ALTER DATABASE [snap_test_src] SET CURSOR_CLOSE_ON_COMMIT OFF ALTER DATABASE [snap_test_src] SET CURSOR_DEFAULT GLOBAL ALTER DATABASE [snap_test_src] SET CONCAT_NULL_YIELDS_NULL OFF ALTER DATABASE [snap_test_src] SET NUMERIC_ROUNDABORT OFF ALTER DATABASE [snap_test_src] SET QUOTED_IDENTIFIER OFF ALTER DATABASE [snap_test_src] SET RECURSIVE_TRIGGERS OFF ALTER DATABASE [snap_test_src] SET DISABLE_BROKER ALTER DATABASE [snap_test_src] SET DATE_CORRELATION_OPTIMIZATION OFF ALTER DATABASE [snap_test_src] SET TRUSTWORTHY OFF ALTER DATABASE [snap_test_src] SET ALLOW_SNAPSHOT_ISOLATION OFF ALTER DATABASE [snap_test_src] SET PARAMETERIZATION SIMPLE ALTER DATABASE [snap_test_src] SET READ_COMMITTED_SNAPSHOT OFF ALTER DATABASE [snap_test_src] SET HONOR_BROKER_PRIORITY OFF ALTER DATABASE [snap_test_src] SET RECOVERY FULL ALTER DATABASE [snap_test_src] SET PAGE_VERIFY CHECKSUM ALTER DATABASE [snap_test_src] SET DB_CHAINING OFF ALTER DATABASE [snap_test_src] SET FILESTREAM( NON_TRANSACTED_ACCESS = OFF ) ALTER DATABASE [snap_test_src] SET TARGET_RECOVERY_TIME = 60 SECONDS ALTER DATABASE [snap_test_src] SET DELAYED_DURABILITY = DISABLED ALTER DATABASE [snap_test_src] SET QUERY_STORE = OFF ALTER DATABASE [snap_test_src] SET READ_WRITE ALTER DATABASE [snap_test_src] SET MULTI_USER
USE snap_test_src; CREATE TABLE stats_test(col1 INT NOT NULL, col2 INT NOT NULL); ;WITH num AS (SELECT TOP (1000) n = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM master..spt_values) INSERT INTO stats_test SELECT n, 1001 - n FROM num;
SELECT * FROM stats_test where col1 < 3;
SELECT ss.name, ss.stats_id, ss.auto_created, ss.user_created, ss.is_temporary , sp.last_updated FROM snap_test_src.sys.stats ss CROSS APPLY snap_test_src.sys.dm_db_stats_properties(ss.object_id, ss.stats_id) sp WHERE ss.object_id = object_id('stats_test');
Now let's make that statistic stale 😆 A thousand more numbers oughtta do it.
USE snap_test_src; ;WITH num AS (SELECT TOP (1000) n = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM master..spt_values) INSERT INTO stats_test SELECT n, 1001 - n FROM num;
All right. Got a table with 2000 rows and a single auto-created statistics object that's now stale.
Let's create a snapshot database!!
CREATE DATABASE snap_test ON ( NAME = snap_test_src, FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL15.MSSQLSERVER\MSSQL\DATA\snap_test_src.ss') AS SNAPSHOT OF snap_test_src;
OK... now for a query in the snapshot database...
USE snap_test SELECT * FROM stats_test WHERE col1 < 3 AND col2 > 998;
Let's check those stats... now there are two stats and both are marked as temporary.
SELECT ss.name, ss.stats_id, ss.auto_created, ss.is_temporary , sp.last_updated FROM snap_test.sys.stats ss CROSS APPLY snap_test.sys.dm_db_stats_properties(ss.object_id, ss.stats_id) sp WHERE ss.object_id = object_id('stats_test');
Let's take another look, and compare the stats in the source database and the snapshot database.
(*** update 2020 September 24 ***
this little trick where I reference dm_db_stats_properties in a different database by adding the database name to the reference works in SQL Server 2019 CU6. But it didn't work on a SQL Server 2016 instance I tried the same trick on - looks like both references to the function resolved to the database context my query was in.)
USE snap_test SELECT ss.name, source_rows = sp2.rows, source_sample = sp2.rows_sampled , snap_rows = sp.rows, snap_sample = sp.rows_sampled FROM snap_test.sys.stats ss CROSS APPLY snap_test.sys.dm_db_stats_properties(ss.object_id, ss.stats_id) sp OUTER APPLY snap_test_src.sys.dm_db_stats_properties(ss.object_id, ss.stats_id) sp2 WHERE ss.object_id = object_id('stats_test');
Consider a snapshot database which is created daily. The purpose is to provide analytics reporting access while maintenance or data loads take place in the source database. In the snapshot database, analytics reports have no locking concerns from the activity in the underlying source database. And the temporary statistics provided by SQL Server, combined with the statistics inherited from the source database, provide a lot of information to the optimizer for query plan selection.
But what if significant fact tables are queried in the snapshot and leave a situation like col2 in stats_test? A column which generates an auto-created stat in the snapshot, but never gets a statistic created in the source database. Each day, the cost of creating that statistic and every statistic like it will be paid as part of the workload. Even if the underlying table is a now-stable dimension.
q
If a snapshot database is being used to provide reporting access, it may be valuable to check for temporary statistics that are perpetually being created - or updated - in the snapshot. In some cases, creating or updating the statistic in the underlying source database may provide a benefit by eliminating redundant work.
No comments:
Post a Comment