Friday, August 28, 2015

I got a hunch about SQL Server Fragmentation Part I

I'm about a year behind in blogging, maybe more.  If I split my ideas into smaller bites, and make a series when appropriate, perhaps I'll get more blogging done.  Seems to work for me on Twitter :-)

If you hunger for more of my musing about SQL Server fragmentation, my next post is at:
I got a hunch about SQL Server fragmentation. Part II
http://sql-sasquatch.blogspot.com/2015/09/i-got-hunch-about-sql-server.html

*****
(September 15, 2015) If you've read this post in the past, and it seems kinda different now... you're right!!  After posts 2-5 in the series I came back and re-factored :-) Trying to include html code windows for all of my queries now, simplified the example table structure a little, and so on...   
*****

I had a hunch about achieving maximum possible logical fragmentation in a SQL Server clustered index based on some work I'd done in a different database engine.  Finally decided to put my hunch to the test.

In a brand new database with a single data file, I created my test table.  The table is very simple, constructed so that each row will consume 1 8kb leaf page due primarily to the 4096 character varchar column.  Who'd really create a table like that?  And why wouldn't they page compress it??  Come on, I'm only an enthusiast and this is my sandbox :-)


CREATE TABLE [dbo].[sq_dashE_1]
            ([ID_INT] [int] NOT NULL,
             [VARCHAR_4096] [varchar](4096) NULL
             CONSTRAINT [PK_sq_dashE_1] 
    PRIMARY KEY CLUSTERED([ID_INT] ASC) ON [PRIMARY]
             ) ON [PRIMARY];

I inserted 4084 rows in the table, starting with clustered index key value 4084 and subtracting one along the way, down to one. Why 4084? Cuz this isn't the first time I've done this - or even in the first twelve times, really. :-) I knew it'd come out to 4096 total pages with index pages included.


SET NOCOUNT ON
DECLARE @string VARCHAR(4096) = 'AA';
DECLARE @ij     INT           = 4084, @jk     INT = 12; 
--stuff @string to 4096 characters
WHILE @jk > 1
      BEGIN
            SELECT @string = STUFF(@string,2,0,@string);
            SELECT @jk = @jk - 1;
      END ;

WHILE @ij > 0 
      BEGIN
            INSERT INTO sq_dashE_1 VALUES (@ij, @string); 
            SELECT @ij=@ij-1;
      END ;

So how does it look? After populating the table, I used the index physical stats function to get fragmentation information.


SET NOCOUNT ON
SELECT object_name(pstats.object_id) AS tname,
       index_type_desc,index_level,
       avg_fragment_size_in_pages,
       fragment_count,page_count,
       avg_fragmentation_in_percent 
FROM   sys.dm_db_index_physical_stats(db_id(),NULL,NULL,NULL,'DETAILED') AS pstats
WHERE  pstats.object_id IN (object_id('sq_dashE_1'))



Aha!  You can see above that there are 4084 leaf pages, and 4084 fragments in the leaf pages.  The average fragment size in pages is 1. It seems maybe populating the ascending key clustered index in reverse order fragmented the table as bad as possible evvaarrr.

Merrill Aldrich made a great tool for inspecting physical page placement of small databases.  You can download version 1.2 of this tool at the following link. It's a wonderful tool for exploration - but not really suited for large databases in its current form.
http://sqlblog.com/blogs/merrill_aldrich/archive/2013/10/09/sql-file-layout-viewer-1-2.aspx

Using this tool, its easy to spotcheck my hunch: Inserting rows in decreasing key value into an ascending clustered index leads to "perfect" or complete fragmentation as reported by avg_fragmentation_in_percent from the sys.dm_db_index_physical_stats function.  The test table sq_dashE_1 is in orange below.  But probably the first thing you notice is sq_dashE_1 pages and extents are quite contiguous. (And nevermind the high page numbers and the other database contents seen at the periphery which may or may not be from post 5 in this series :-})


In the graphic above, notice that the "previous page" for page ID 208088 is page 208090.  And the
"next page" page 208088 (the lowest page ID allocated to sq_dashE_1) is undefined.  (Page ID 208089 is skipped between leaf pages on either side because its an index page at a higher level - it only points to leaves.)

Let's take a look at details provided from function sys.dm_db_database_page_allocations.


SET NOCOUNT ON
SELECT   extent_page_id,
         previous_page_page_Id,
         allocated_page_page_id,
         next_page_page_id,
         is_allocated,
         page_level,
         page_type_desc
FROM     sys.dm_db_database_page_allocations(DB_ID(), OBJECT_ID('sq_dashE_1'), 1, NULL, 'DETAILED')
WHERE    page_type_desc<>'IAM_PAGE'
ORDER BY allocated_page_page_id


So in the graphic above, perhaps more nonintuitively than you'd ever imagined, I show the backward but predictable relationship of the logical order of the pages and their physical order.  I'm only showing 2 extents worth of the results, but the pattern continues throughout.

How many 64kb extents does test table sq_dashE_1 span?


SELECT COUNT (DISTINCT extent_page_id) AS number_of_extents_LIMITED
FROM   sys.dm_db_database_page_allocations(DB_ID(), OBJECT_ID('sq_dashE_1'), 1, NULL, 'LIMITED');

SELECT COUNT (DISTINCT extent_page_id) AS number_of_extents_DETAILED
FROM   sys.dm_db_database_page_allocations(DB_ID(), OBJECT_ID('sq_dashE_1'), 1, NULL, 'DETAILED')
WHERE  page_type_desc<>'IAM_PAGE';


Test table sq_dashE_1 spans 513 extents.  The LIMITED returns of function sys.dm_db_database_page_allocations don't bring back values for page_type_desc to exclude the IAM page :-)

OK.  So the 4084 leaf pages are reported by sys.dm_db_index_physical_stats as 4084 fragments of 1 page each.  The leaf pages are spread across 513 extents.  How many physical reads will it take to do an unordered full scan of the table?  Anyone think 4084, one read per fragment?  How about 513 - reading an entire 64k extent at a time? Let's see.


IF OBJECT_ID('tempdb..#blackhole') IS NOT NULL DROP TABLE #blackhole 
CHECKPOINT;
DBCC DROPCLEANBUFFERS;
SET STATISTICS IO ON;
SELECT num_of_reads,num_of_bytes_read from sys.dm_io_virtual_file_stats(DB_ID(),1);
SELECT * INTO #blackhole FROM sq_dashE_1 where varchar_4096 like '%b%'
SELECT num_of_reads,num_of_bytes_read from sys.dm_io_virtual_file_stats(DB_ID(),1);



87 reads. 33554432 bytes=4096 eight kb pages(4084 leaf pages and index pages).

And what about the report from STATISTICS IO?


Here's something I mentioned elsewhere - still not gonna support my assertion, just gonna drop it here: disk reads to support database pages by read-ahead aren't reported by STATISTICS IO, only the number of database pages read by read-ahead.  So in this case we've seen 87 disk reads from sys.dm_io_virtual_file_stats and STATISTICS IO is reporting 3.  That leaves 84 reads to cover the 4072 pages read by read-ahead.  The average read size across all 87 reads was over 370kb - somewhere between 5 and 6 extents by average.

And if you read ahead in the series to the 5th post, you'll see a similar table, with a measly reported .0245% fragmentation that requires 338 reads for a similar query!!!
I got a hunch about SQL Server Fragmentation, Part V
http://sql-sasquatch.blogspot.com/2015/09/i-got-hunch-about-sql-server_79.html

So maybe fragmentation isn't all that bad??
Nahh, that's not where I'm going with this at all :-)

But I do plan to show that, depending on goals for index maintenance, index fragmentation evaluation may need to be a lot different than current common methods.

Besides, in post 4 I show a "perfectly defragmented" table of similar structure that accomplishes the same query in 68 reads :-)
I got a hunch about SQL Server Fragmentation, Part IV
http://sql-sasquatch.blogspot.com/2015/09/i-got-hunch-about-sql-server_14.html

OK, done for today.  Well... ok, I got one more thing.

There are 4096 8k pages in sq_dashE_1.  That'd be 512 extents(eight pages each) if perfectly packed.  But we know from sys.dm_db_database_page_allocations the pages for sq_dashE_1 span 513 extents.  So somewhere there's a blank space.

Your move, Taylor.




No comments:

Post a Comment