sql.sasquatch: I got a hunch about SQL Server fragmentation Part III

In my last post about SQL Server fragmentation, I showed that
avg_fragmentation_in_percent from sys.dm_db_index_physical_stats can report ~14% fragmentation for a clustered index with 7 data pages that is properly ordered and within a single extent. For the table/clustered index I used as an example, there is no operation that will remove the "fragmentation" reported.

You can see me previous musing here
I got a hunch about SQL Server fragmentation Part II
http://sql-sasquatch.blogspot.com/2015/09/i-got-hunch-about-sql-server.html
If you're seeking more after this post, you can find the next one at
I got a hunch about SQL Server fragmentation Part IV
http://sql-sasquatch.blogspot.com/2015/09/i-got-hunch-about-sql-server_14.html

What about a non-clustered index? I'll create one that's patently ridiculous - a nonclustered index that is redundant to the clustered index, including the varchar column. I'm using sort_in_tempdb to eliminate the possibility that intermediate sort runs are injecting fragmentation. MAXDOP is set to 1 in the index creation, to eliminate the possibility of fragmentation due to contention during allocation (my tests are the only activity in this instance of SQL Server).

CREATE NONCLUSTERED INDEX nci_sq_CI 
       ON sq_CI (ID_INT ASC)
       INCLUDE ( varchar_4096 )
       WITH ( SORT_IN_TEMPDB = ON, MAXDOP = 1 );

SELECT object_name(pstats.object_id) AS tname,
       index_type_desc,index_level,
       avg_fragment_size_in_pages,
       fragment_count,page_count,
       avg_fragmentation_in_percent 
FROM   sys.dm_db_index_physical_stats(db_id(),NULL,NULL,NULL,'DETAILED') AS pstats
WHERE  pstats.object_id IN (object_id('sq_CI'))

That makes sense. I've constructed the clustered index and the nonclustered index so they have the same contents, in the same order.

So full scans of the CI and the NCI should look the same - same number of bytes read, same number of reads. Let's check it out.

IF OBJECT_ID('tempdb..#blackhole') IS NOT NULL DROP TABLE #blackhole 
CHECKPOINT;
DBCC DROPCLEANBUFFERS;
SET STATISTICS IO ON;
SELECT num_of_reads,num_of_bytes_read from sys.dm_io_virtual_file_stats(DB_ID(),1);
SELECT * INTO #blackhole FROM sq_CI WITH (INDEX(pk_sq_CI)) WHERE varchar_4096 like '%b%' --CI fullscan
SELECT num_of_reads,num_of_bytes_read from sys.dm_io_virtual_file_stats(DB_ID(),1);
SET STATISTICS IO OFF;

Full scan of the clustered index: 1 read, 65536 bytes. IO statistics agrees.

OK, now a full scan of the nonclustered index. Isn't this just glorious fun?

IF OBJECT_ID('tempdb..#blackhole') IS NOT NULL DROP TABLE #blackhole 
CHECKPOINT;
DBCC DROPCLEANBUFFERS;
SET STATISTICS IO ON;
SELECT num_of_reads,num_of_bytes_read from sys.dm_io_virtual_file_stats(DB_ID(),1);
SELECT * INTO #blackhole FROM sq_CI WITH (INDEX(nci_sq_CI)) WHERE varchar_4096 like '%b%' --NCI fullscan
SELECT num_of_reads,num_of_bytes_read from sys.dm_io_virtual_file_stats(DB_ID(),1);
SET STATISTICS IO OFF;

Well now!! That's peculiar! More peculiar than I can dig into for today. 3 reads and 180224 read bytes reported by sys.dm_io_virtual_file_stats. Note IO statistics read-ahead count 14 is higher than logical read count 9. No time for that today :-)

Read-ahead count 14 is also by itself higher than the 3 reads reported by sys.dm_io_virtual_file_stats. Only got time today to give a quick explanation, not demonstrate: in IO statistics, read-ahead reads is a count of pages that were read by read-ahead - not a count of the read operations that were performed. And the read operations which handled the read-ahead reads are not in the physical read count reported by IO statistics - they are in addition to that count. So sys.dm_io_virtual_file_stats reported 3 reads: one read accounted for by IO statistics physical reads = 1, and two of those reads accomplished the read-ahead of the 14 pages reported by IO statistics as read-ahead reads = 14.

OK. Let's go to sys.dm_db_database_page_allocations to figure this out.

SELECT 
extent_page_id,allocated_page_page_id,is_allocated,
page_type_desc,page_level,next_page_page_id,previous_page_page_id
FROM sys.dm_db_database_page_allocations(DB_ID(),object_id('sq_CI'),2,NULL,'DETAILED')
WHERE page_type_desc <> 'IAM_PAGE'

So page 664 is the first page in an extent, and its the level 1 page. The 7 level 0 pages are in logical order:
624=>632=>633=>634=>635=>636=>637
The first page is by itself in an extent. The next 6 level 0 pages are in the 3rd extent.
3 extents. 3 reads. Same reported avg_fragmentation_in_pct as the clustered index with all 7 data pages and an index page in a single 64kb extent - against which a fullscan required 1 read of 65536 bytes.

Turns out in this case, sys.dm_db_index_physical_physical_stats alone doesn't help predict the difference in behavior for the nonclustered index and the clustered index.

Now what happens when I rebuild the NCI?

ALTER INDEX NCI_sq_CI ON sq_CI
REBUILD
WITH 
(SORT_IN_TEMPDB = ON,
 MAXDOP = 1,
 ONLINE = ON);

IF OBJECT_ID('tempdb..#blackhole') IS NOT NULL DROP TABLE #blackhole 
CHECKPOINT;
DBCC DROPCLEANBUFFERS;
SET STATISTICS IO ON;
SELECT num_of_reads,num_of_bytes_read from sys.dm_io_virtual_file_stats(DB_ID(),1);
SELECT * INTO #blackhole FROM sq_CI WITH (INDEX(nci_sq_CI)) WHERE varchar_4096 like '%b%' --NCI fullscan
SELECT num_of_reads,num_of_bytes_read from sys.dm_io_virtual_file_stats(DB_ID(),1);
SET STATISTICS IO OFF;

All right! 1 read now for the full scan, 65536 bytes. But, what about sys.dm_db_index_physical_stats?

SELECT object_name(pstats.object_id) AS tname,
       index_type_desc,index_level,
       avg_fragment_size_in_pages,
       fragment_count,page_count,
       avg_fragmentation_in_percent 
FROM   sys.dm_db_index_physical_stats(db_id(),NULL,NULL,NULL,'DETAILED') AS pstats
WHERE  pstats.object_id IN (object_id('sq_CI'))

Oh. So rebuilding the NCI got all of the pages into a single extent. Full scan of the NCI dropped from 3 physical reads for a total of 180224 bytes to 1 read for a total of 65536 bytes. But sys.dm_db_index_physical_stats gives no hint of that change in read behavior - it looks exactly as it did before.

On the other hand, sys.dm_db_database_page_allocations shows what happened with the index rebuild.

SELECT 
extent_page_id,allocated_page_page_id,is_allocated,
page_type_desc,page_level,next_page_page_id,previous_page_page_id
FROM sys.dm_db_database_page_allocations(DB_ID(),object_id('sq_CI'),2,NULL,'DETAILED')
WHERE page_type_desc <> 'IAM_PAGE'

OK, now I really have to go for today...

sql.sasquatch

Sunday, September 13, 2015

I got a hunch about SQL Server fragmentation Part III

No comments:

Post a Comment

sql.sasquatch Archive