Previous musing here.
I got a hunch about SQL Server fragmentation Part I
http://sql-sasquatch.blogspot.com/2015/08/i-got-hunch-about-sql-server.html
And if you are hungering for more after this, please proceed to
I got a hunch about SQL Server fragmentation Part III
http://sql-sasquatch.blogspot.com/2015/09/i-got-hunch-about-sql-server_13.html
I'm working in a newly created test database, which has a single database file.
I'll start by creating my favorite (for now) type of table - two columns with one an integer clustered index key, and the second a 4096 character VARCHAR. Because the clustered index is not compressed, each row in this table will be 1 page.
CREATE TABLE [dbo].[sq_CI] ([ID_INT] [int] NOT NULL, [VARCHAR_4096] [VARCHAR](4096) NULL CONSTRAINT [PK_sq_CI] PRIMARY KEY CLUSTERED([ID_INT] ASC) ON [PRIMARY] ) ON [PRIMARY];
Important to disclose that in the SQL Server instance for my testing, trace flag 1118 is enabled globally.
You can read more about trace flag 1118 in Paul Randal's post below.
Misconceptions around TF 1118
http://www.sqlskills.com/blogs/paul/misconceptions-around-tf-1118/
Lets populate the table, and grab fragmentation information.
DECLARE @string VARCHAR(4096) = 'AA', @maxVal INT = 7; DECLARE @ij INT = 1 , @jk INT = 12; --stuff @string to 4096 characters WHILE @jk > 1 BEGIN SELECT @string = STUFF(@string,2,0,@string); SELECT @jk = @jk - 1; END ; WHILE @ij < @maxVal+1 BEGIN INSERT INTO sq_CI VALUES (@ij, @string); SELECT @ij=@ij+1; END ; SELECT object_name(pstats.object_id) AS tname, index_type_desc,index_level, avg_fragment_size_in_pages, fragment_count,page_count, avg_fragmentation_in_percent FROM sys.dm_db_index_physical_stats(db_id(),NULL,NULL,NULL,'DETAILED') AS pstats WHERE pstats.object_id IN (object_id('sq_CI'))
The result:
Huh. 14% fragmentation, 2 fragments. I thought it'd be better than that.
What happens if we checkpoint(to write out any dirty database pages), dropcleanbuffers and do a full tablescan?
CHECKPOINT; DBCC DROPCLEANBUFFERS; SET STATISTICS IO ON; SELECT num_of_reads,num_of_bytes_read from sys.dm_io_virtual_file_stats(DB_ID(),1); SELECT * FROM sq_CI where varchar_4096 like '%b%' SELECT num_of_reads,num_of_bytes_read from sys.dm_io_virtual_file_stats(DB_ID(),1); SET STATISTICS IO OFF;
1 Read. 65536 bytes. That's one extent. How can two fragments be read in one 64kb read?
Maybe sys.dm_db_database_page_allocations can help clear the matter up?
This information makes it clear that all of the DATA_PAGE pages are in the same 64kb extent.
In fact, the physical order of DATA_PAGE pages is in agreement with the logical order of DATA_PAGE pages:
78168=>78170=>78171=>78172=>78173=>78174=>78175
So the ordering of the clustered index DATA_PAGE pages is good - in fact it can't be made any better. (Dare ya to try :-})
All of DATA_PAGE pages can be retrieved in a single disk read.
So why does sys.dm_db_index_physical_stats report over 14% fragmentation?
Why does sys.dm_db_index_physical_stats report 2 fragments, when all pages can be retrieved in one read?
I suspect an algorithm error. My guess is that the algorithm is looking at the next logical DATA_PAGE for each DATA_PAGE, and comparing to the next ALLOCATED_PAGE_PAGE_ID that IS_ALLOCATED for the Index Id. By that algorithm, the placement of INDEX_PAGE 78169 divides the 7 DATA_PAGE pages incorrectly into two "fragments".
Instead the algorithm should look at the next logical DATA_PAGE for each DATA_PAGE, and compare to the next ALLOCATED_PAGE_PAGE_ID that IS_ALLOCATED for the Index Id for DATA_PAGE page_type_desc.
That's a subtle distinction that would make ~14% "fragmentation" magically disappear in this case.
That's it for now I guess...
No comments:
Post a Comment