OK. This is a little test in my SQL Server 2019 CU11 instance. But the behavior I will demonstrate goes way back... and seems like it just may go way forward, too.
First, let's create a test database, with a nice little 3 column CCI table in it.
USE [master]; GO CREATE DATABASE [test_SI] ON PRIMARY ( NAME = N'test_SI' , FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL15.V2019CU11\MSSQL\DATA\test_SI.mdf') LOG ON ( NAME = N'test_SI_log' , FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL15.V2019CU11\MSSQL\DATA\test_SI_log.ldf'); ALTER DATABASE [test_SI] SET ALLOW_SNAPSHOT_ISOLATION ON; GO USE [test_SI] DROP TABLE IF EXISTS dbo.SI_DeletedRows; CREATE TABLE dbo.SI_DeletedRows ( x_key [bigint] NOT NULL , x_value [int] NULL , x_comment [varchar](512) NULL , INDEX [CCI__SI_DeletedRows] CLUSTERED COLUMNSTORE );
Let's take 2 minutes (or less) to insert 104,203,264 rows into the clustered columnstore index created.
;WITH sixteen AS ( SELECT num = 1 UNION ALL SELECT 1
UNION ALL SELECT 1 UNION ALL SELECT 1UNION ALL SELECT 1 UNION ALL SELECT 1UNION ALL SELECT 1 UNION ALL SELECT 1UNION ALL SELECT 1 UNION ALL SELECT 1
UNION ALL SELECT 1 UNION ALL SELECT 1
UNION ALL SELECT 1 UNION ALL SELECT 1
UNION ALL SELECT 1 UNION ALL SELECT 1)
INSERT INTO dbo.SI_deletedRows
SELECT ROW_NUMBER() OVER (ORDER BY s1.number DESC), s1.number, s1.name
FROM master..spt_values s1
CROSS JOIN master..spt_values s2CROSS JOIN sixteen;
/* 104203264 rowsmin - 1:37max - 1:56 */
All right - now let's check on the rowgroup quality. Pretty nice; 100 rowgroups. 99 of them at maximum size, and 1 with almost 400,000 rows.
SELECT csrgps.state_desc, num_groups = count(*), rows_in_group = csrgps.total_rows , csrgps.trim_reason_desc, csrgps.transition_to_compressed_state_desc FROM sys.dm_db_column_store_row_group_physical_stats csrgps WHERE csrgps.object_id = object_id('dbo.SI_DeletedRows') GROUP BY csrgps.state_desc, csrgps.total_rows, csrgps.trim_reason_desc , csrgps.transition_to_compressed_state_desc;
Let's take a minute and delete 1 row out of every 15 from the CCI.
DELETE FROM dbo.SI_DeletedRows WHERE x_key % 15 = 0; /* 6946884 rows, 6.67% 0:51 */
Here's a quick little query against the CCI, in Read Committed transaction isolation level. The query is run all by itself on the instance, at DOP 1.
SET TRANSACTION ISOLATION LEVEL Read Committed SET STATISTICS TIME ON SELECT COUNT(1) FROM dbo.SI_DeletedRows WHERE x_comment IN ('money', 'smallmoney', 'offline', 'Little Vito') OR x_comment IS NULL OPTION (MAXDOP 1); /* SQL Server Execution Times: CPU time = 312 ms, elapsed time = 327 ms. */
Let's run the query again at DOP 1. But change the isolation level to Snapshot. No other queries are running on the instance, so there is no competition for resources. There is nothing in the version store, and there are no other open transactions against the objects in the query. Snapshot isolation all by itself is responsible for increasing the CPU time and the elapsed time of this query on a CCI with deleted rows by a factor of eight.
SET TRANSACTION ISOLATION LEVEL SNAPSHOT SET STATISTICS TIME ON SELECT COUNT(1) FROM dbo.SI_DeletedRows WHERE x_comment IN ('money', 'smallmoney', 'offline', 'Little Vito') OR x_comment IS NULL OPTION (MAXDOP 1); /* SQL Server Execution Times: CPU time = 2625 ms, elapsed time = 2638 ms. */
I've seen this behavior on production systems. But in the field, the effect can be even more pronounced than this example. I've seen more complicated queries involving CCIs have snapshot isolation increase their runtime from under 1 second to over 60 seconds.
That's too bad, since snapshot isolation is a canonical solution that can facilitate read analytics queries against the same database supporting read-write workloads.
If Snapshot Isolation does not meet workload goals, often RCSI is evaluated as an option. In this case, RCSI introduces the same performance overhead as snapshot isolation.
An Availability Group readable secondary also experiences the same performance overhead.
Removing all deleted rows from a CCI allows CCI queries in Read Committed and Snapshot Isolation levels to perform comparably (assuming no competition for resources or locks). But, CCI reorgs will only remove deleted rows from rowgroups with at least 10% of their rows deleted. And CCI rebuilds may not be appropriate given resource and other constrains.
No comments:
Post a Comment