Tuesday, February 23, 2016

SQL Server: Who on Earth would rebuild an index at 0.44% fragmentation?!?

*Warning* I know, I know... columnstore is all the rage.  This post is strictly rowstore stuff, sorry. :-)
Within SQL Server, 8k database pages are organized into 8 page extents. 
Here's a good introductory resource.
Understanding Pages and Extents
The idea of 'mixed extents' is an important one - especially because future mixed extents in user hobts can be eliminated via trace flag 1118.  (Yes, tempdb uses T1118 by default in SQL Server 2016 but it seems that in user databases if you want to avoid mixed extents, gotta still use T1118.) 
The idea of fragmentation for databases is somewhat complex - and sometimes downright controversial.  In various contexts some or all of the following concepts may be involved:
1. Contiguity of database free space within data files.  This isn't discussed very often these days, but look back in the Oracle forums: especially in some of the flame-throwing fragmentation discussions of old and it comes up.
2. Order of index pages compared to order of organizing index key.  This idea is extremely popular for SQL Server.
3. Compactness of contents within data pages.  This idea is starting to gain traction in SQL Server circles, largely due to work by Paul Randal and Jonathan Kehayias, showing that low data density in hobt pages wastes database cache and by extension CPU. 
4. Contiguity of database pages within database files.  Assume that SQL Server data files are on Windows volumes 1:1 mapped to underlying LUNs (rather than Storage Spaces, dynamic disk striped volume, etc).  Maximum read size will be determined by interaction of contiguous hobt pages, maximum transfer size allowed to disk storage (usually a limit of the adapter driver) and various rules about read requests in the database engine.*
On a high IOPs system, or a system where readahead read bursts stress disk IO queue depth, increasing average read size and decreasing the number of necessary read operations can be an important part of tuning.
*For example, backup reads don't need to be concerned about boundaries between hobts, but when a query requests reads from table A index B, pages from table C won't be returned even if it means lowering the total read operation count.
Let's take a look at an example.
I've got table TABLE1, with clustered index CI_TABLE1.  There's also the cleverly named nonclustered index, NC_TABLE1.
My database has 8 equally sized data files in the primary filegroup.  Each database file is on its own LUN; that's how I roll.
Well - let's look at the stalwart foundation of determining when to defragment an index.
SELECT s_tables.name AS table_name,
s_indexes.name AS index_name,
CAST(pstats.avg_fragmentation_in_percent AS NUMERIC(18,2)) AS avg_frag_pct,
CAST(pstats.avg_fragment_size_in_pages AS NUMERIC(18,2)) AS avg_frag_pages,
CAST(pstats.avg_page_space_used_in_percent AS NUMERIC(18,2)) AS avg_page_used_pct,
FROM sys.tables AS s_tables
JOIN sys.indexes AS s_indexes
ON s_indexes.[object_id] = s_tables.[object_id]
s_indexes.index_id,NULL,'DETAILED') AS pstats
WHERE s_tables.name = 'TABLE1';
That looks pretty good, huh? 
Who on earth would rebuild an index with 0.44% fragmentation?
I'm your huckleberry.
I'm not the guy you want to write queries... but I will write queries in a pinch.  Usually it's because I can't find a query that does what I want.
So... I wrote this query.
DECLARE @table_name VARCHAR(256);
SET @table_name = 'TABLE1'
SELECT subQ3.contig_extent_run_len,
       count(*)  AS contig_extent_run_count,
      (SELECT subQ2.table_name,
              count(*) AS contig_page_run_count,
              CASE WHEN contig_page_run_len % 8 = 0
                        THEN contig_page_run_len / 8
                   ELSE 1 + (contig_page_run_len / 8)
              END      AS contig_extent_run_len
            (SELECT subQ1.table_name,
                 min(subQ1.page_id)             AS contig_page_run_beg,
                 max(subQ1.page_id)             AS contig_page_run_end,
                 1 + max(subQ1.page_id)
                   - min(subQ1.page_id)
                                                AS contig_page_run_len
                  (SELECT s_tables.name                  AS table_name,
                          s_indexes.name                 AS index_name,
                          allocated_page_file_id         AS fileId,
                          allocated_page_page_id         AS page_id,
                       - ROW_NUMBER() OVER(PARTITION BY allocated_page_file_id
                                              ORDER BY allocated_page_page_id)
                                                         AS contig_cluster_grouper,
                          extent_page_id/8               AS extent_num,
                   FROM   sys.tables                     AS s_tables
                   JOIN   sys.indexes                    AS s_indexes
                     ON   s_indexes.[object_id] = s_tables.[object_id]
                   CROSS APPLY
                   WHERE  s_tables.[object_id] = object_id(@table_name)
                   ) subQ1
             GROUP BY subQ1.table_name,
             ) subQ2
       GROUP BY subQ2.table_name,
      ) subQ3
GROUP BY subQ3.table_name,
What does that mean?  In this SQL Server instance, trace flag 1118 is preventing mixed extents.  The query uses a 'gaps & islands' approach to group index pages into runs of contiguous extents within each database file.  Then counts the number of times that runs of a single extent, two contiguous extents, and so on occur.  The NC_TABLE1 index is composed of 5 runs.  Two of them are single extents.  One run is 4 contiguous extents of the NC_TABLE1 index.  There's also a run of 7 extents, and one run of 23 extents.
Is that good?  NC_TABLE1 is 36 total extents.  288 eight k pages.  2.25 mb. It can be read in 5 reads - one read for each contiguous run.
CI_TABLE1 is comprised of 48 extents.  3 mb. It can be read in 11 reads - again, one for each contiguous run.
The SQL Server instance has the -E startup option enabled.  Without that startup option, proportional fill algorithm would distribute incoming data to the 8 data files with a potential smallest allocation of a single 64k extent before rotating round-robin to the additional files in the filegroup.  With the -E startup option, the smallest possible new allocation by proportional fill is sixty four 64k extents - 4 mb.
That means if I can rebuild the indexes into new space at the tail of the files, the contiguity should be improved considerably.
Yeah, I tried it :-)
Now what does sys.dm_db_index_physical_stats look like? Well - I lost a level in the nonclustered index.  Yeah, the fragmentation numbers look a little better - but they were already so good no-one would bother with a rebuild.
And what do results to my ugly 'gaps and islands' query look like now? 
With -E, each pass of the proportional fill algorithm will send up to 64 extents of new allocations to each file.  There was no other DML activity on the server, and I ran the rebuild at DOP 1.  There actually are 3 contiguous runs of CI_TABLE1at or approaching 64 extents.  NC_TABLE1 now has all of its data pages in a single contiguous run of 21 extents.
I've got more research to do, and more work to do on the contiguity query (it doesn't distinguish between reserved but unallocated pages,  doesn't distinguish between data pages and index pointer pages, etc).
But I wanted to finally get this blog post out here - ideas I started working with in my "Hunch about SQL Server Fragmentation" series.  For very disk intensive workloads, there's an aspect of fragmentation missing from the threshold people use to decide when to rebuild/reorg.  And that same aspect of fragmentation is largely missing from the evaluation of effectiveness of rebuild/reorg.



No comments:

Post a Comment