My list of interest in SQL Serer 2016 RTM CU6. Especially valuable because even though SQL Server 2014 SP1 was made available slightly after RTM CU6, it did not include all of these fixes at the time. This is because SP1 was built on a base of SQL Server 2014 RTM CU5.
12.00.248 Cumulative update package 6 (CU6) for SQL Server 2014 February 16, 2015
https://support.microsoft.com/en-us/kb/3031047
*****
-----
Large query compilation waits on RESOURCE_SEMAPHORE_QUERY_COMPILE in SQL Server 2014
https://support.microsoft.com/en-us/kb/3024815
FIX: SOS_CACHESTORE spinlock contention on system table rowset cache causes high CPU usage in SQL Server 2012 or 2014
https://support.microsoft.com/en-us/kb/3026082
FIX: SOS_CACHESTORE spinlock contention on ad hoc SQL Server plan cache causes high CPU usage in SQL Server 2012 or 2014
https://support.microsoft.com/en-us/kb/3026083
Error 17066 or 17310 during SQL Server startup
https://support.microsoft.com/en-us/kb/3027860
FIX: Access violation occurs when you delete rows from a table that has clustered columnstore index in SQL Server 2014
https://support.microsoft.com/en-us/kb/3029762
FIX: OS error 665 when you execute DBCC CHECKDB command for database that contains columnstore index in SQL Server 2014
https://support.microsoft.com/en-us/kb/3029977
FIX: Memory leak occurs when you run DBCC CHECKDB against a database in SQL Server 2014
https://support.microsoft.com/en-us/kb/3034615
FIX: Error 8646 when you run DML statements on a table with clustered columnstore index in SQL Server 2014
https://support.microsoft.com/en-us/kb/3035165
Monday, October 26, 2015
LIST - SQL Server 2014 SP1 CU1
My list of interest for SQL Server 2014 SP1 CU1 fixes.
*****
Cumulative update package 1 for SQL Server 2014 SP1
Build number: 12.0.4416.0
June 22, 2015
https://support.microsoft.com/en-us/kb/3067839
-----
FIX: Complex parallel query does not respond in SQL Server 2012 or SQL Server 2014
https://support.microsoft.com/en-us/kb/3037624
FIX: Access violation occurs when you connect to an instance of SQL Server 2014
https://support.microsoft.com/en-us/kb/3067828
Large query compilation waits on RESOURCE_SEMAPHORE_QUERY_COMPILE in SQL Server 2014
https://support.microsoft.com/en-us/kb/3024815
Trace flag 6498
FIX: SOS_CACHESTORE spinlock contention on ad hoc SQL Server plan cache causes high CPU usage in SQL Server 2012 or 2014
https://support.microsoft.com/en-us/kb/3026083
FIX: Incorrect choice of a nested loops join strategy causes poor query performance in SQL Server 2014
https://support.microsoft.com/en-us/kb/3042544
CCI Memory management fix from regression introduced in SQL Server 2014 SP1 CU0/RTM CU6 kb3053664
https://support.microsoft.com/en-us/kb/3053664
"Non-yielding Scheduler" error occurs and SQL Server stops responding when a database has columnstore indexes on a Microsoft SQL Server 2014 instance
https://support.microsoft.com/en-us/kb/3069488
FIX: Access violation occurs when you delete rows from a table that has clustered columnstore index in SQL Server 2014
https://support.microsoft.com/en-us/kb/3029762
FIX: Paging out in memory occurs when columnstore index query consumes large memory in SQL Server 2014
https://support.microsoft.com/en-us/kb/3067968
FIX: Severe error in SQL Server 2014 during compilation of a query on a table with clustered columnstore index
https://support.microsoft.com/en-us/kb/3068297
FIX: Error 1205 when you execute parallel query that contains outer join operators in SQL Server 2014
https://support.microsoft.com/en-us/kb/3052167
FIX: Poor performance when a query contains table joins in SQL Server 2014
https://support.microsoft.com/en-us/kb/2999809
FIX: Access violation when you query against a table that contains column store indexes in SQL Server 2014
https://support.microsoft.com/en-us/kb/3064784
FIX: Error 33294 occurs when you alter column types on a table that has clustered columnstore indexes in SQL Server 2014
https://support.microsoft.com/en-us/kb/3070139
FIX: Partial results in a query of a clustered columnstore index in SQL Server 2014
https://support.microsoft.com/en-us/kb/3067257
*****
Cumulative update package 1 for SQL Server 2014 SP1
Build number: 12.0.4416.0
June 22, 2015
https://support.microsoft.com/en-us/kb/3067839
-----
FIX: Complex parallel query does not respond in SQL Server 2012 or SQL Server 2014
https://support.microsoft.com/en-us/kb/3037624
FIX: Access violation occurs when you connect to an instance of SQL Server 2014
https://support.microsoft.com/en-us/kb/3067828
Large query compilation waits on RESOURCE_SEMAPHORE_QUERY_COMPILE in SQL Server 2014
https://support.microsoft.com/en-us/kb/3024815
Trace flag 6498
FIX: SOS_CACHESTORE spinlock contention on ad hoc SQL Server plan cache causes high CPU usage in SQL Server 2012 or 2014
https://support.microsoft.com/en-us/kb/3026083
FIX: Incorrect choice of a nested loops join strategy causes poor query performance in SQL Server 2014
https://support.microsoft.com/en-us/kb/3042544
CCI Memory management fix from regression introduced in SQL Server 2014 SP1 CU0/RTM CU6 kb3053664
https://support.microsoft.com/en-us/kb/3053664
"Non-yielding Scheduler" error occurs and SQL Server stops responding when a database has columnstore indexes on a Microsoft SQL Server 2014 instance
https://support.microsoft.com/en-us/kb/3069488
FIX: Access violation occurs when you delete rows from a table that has clustered columnstore index in SQL Server 2014
https://support.microsoft.com/en-us/kb/3029762
FIX: Paging out in memory occurs when columnstore index query consumes large memory in SQL Server 2014
https://support.microsoft.com/en-us/kb/3067968
FIX: Severe error in SQL Server 2014 during compilation of a query on a table with clustered columnstore index
https://support.microsoft.com/en-us/kb/3068297
FIX: Error 1205 when you execute parallel query that contains outer join operators in SQL Server 2014
https://support.microsoft.com/en-us/kb/3052167
FIX: Poor performance when a query contains table joins in SQL Server 2014
https://support.microsoft.com/en-us/kb/2999809
FIX: Access violation when you query against a table that contains column store indexes in SQL Server 2014
https://support.microsoft.com/en-us/kb/3064784
FIX: Error 33294 occurs when you alter column types on a table that has clustered columnstore indexes in SQL Server 2014
https://support.microsoft.com/en-us/kb/3070139
FIX: Partial results in a query of a clustered columnstore index in SQL Server 2014
https://support.microsoft.com/en-us/kb/3067257
LIST - SQL Server 2014 SP1 CU2
Here's my list of fixes on interest for SQL Server 2014 SP1 CU2.
Cumulative update 2 for SQL Server 2014 SP1
Build number: 12.0.4422.0
August 17, 2015
https://support.microsoft.com/en-us/kb/3075950
-----
FIX: CMEMTHREAD waits occur when you execute many ad hoc queries in SQL Server 2012 or SQL Server 2014
https://support.microsoft.com/en-us/kb/3074425
FIX: Incorrect results occur in a rare scenario when you run a query that contains a nested loop join and performance spool in its inner side in SQL Server 2012 or 2014
https://support.microsoft.com/en-us/kb/3068776
FIX: The bcp utility may crash when you use SQL Server 2012 or SQL Server 2014
https://support.microsoft.com/en-us/kb/3054504
Cumulative update 2 for SQL Server 2014 SP1
Build number: 12.0.4422.0
August 17, 2015
https://support.microsoft.com/en-us/kb/3075950
-----
FIX: CMEMTHREAD waits occur when you execute many ad hoc queries in SQL Server 2012 or SQL Server 2014
https://support.microsoft.com/en-us/kb/3074425
FIX: Incorrect results occur in a rare scenario when you run a query that contains a nested loop join and performance spool in its inner side in SQL Server 2012 or 2014
https://support.microsoft.com/en-us/kb/3068776
FIX: The bcp utility may crash when you use SQL Server 2012 or SQL Server 2014
https://support.microsoft.com/en-us/kb/3054504
LIST - SQL Server 2014 SP1 CU3
You probably won't learn anything from this post. Gonna try to prefix titles of such posts with "LIST" from now on :-) This post is just a list of kbs I am especially interested in.
*****
Build 12.00.4427, SQL Server 2014 SP1 CU3 became available October 21, 2015.
Cumulative Update 3 for SQL Server 2014 Service Pack 1
https://support.microsoft.com/en-us/kb/3094221
These are the fixes that I have a particular interest in.
FIX: High CPU usage after you reduce the value of the CAP_CPU_PERCENT argument in creating a resource pool
https://support.microsoft.com/en-us/kb/3097073
FIX: Scheduler deadlock on primary replica when you remove a replica from an AlwaysOn availability group in SQL Server 2012 or SQL Server 2014
https://support.microsoft.com/en-us/kb/3088307
FIX: Queries take longer time to compile in SQL Server 2014 than in earlier versions
https://support.microsoft.com/en-us/kb/3097114
FIX: DBCC CHECKFILEGROUP reports false inconsistency errors on database that has partitioned table in SQL Server 2014
https://support.microsoft.com/en-us/kb/3099323
FIX: Sort operator spills to tempdb in SQL Server 2012 or SQL Server 2014 when estimated number of rows and row size are correct
https://support.microsoft.com/en-us/kb/3088480
Trace flag 7470
FIX: Rare index corruption when you build a columnstore index with parallelism on a partitioned table in SQL Server 2014
https://support.microsoft.com/en-us/kb/3080155
FIX: Access violation occurs when you execute a query that contains many joins and coalesce functions in SQL Server 2014
https://support.microsoft.com/en-us/kb/3082860
FIX: DBCC CHECKDB doesn't report metadata inconsistency of invalid filegroups in its summary in SQL Server 2014
https://support.microsoft.com/en-us/kb/3100361
FIX: It takes a longer time than earlier versions of SQL Server when you run DBCC CHECKDB in SQL Server 2012
https://support.microsoft.com/en-us/kb/3092702
*****
Build 12.00.4427, SQL Server 2014 SP1 CU3 became available October 21, 2015.
Cumulative Update 3 for SQL Server 2014 Service Pack 1
https://support.microsoft.com/en-us/kb/3094221
These are the fixes that I have a particular interest in.
FIX: High CPU usage after you reduce the value of the CAP_CPU_PERCENT argument in creating a resource pool
https://support.microsoft.com/en-us/kb/3097073
FIX: Scheduler deadlock on primary replica when you remove a replica from an AlwaysOn availability group in SQL Server 2012 or SQL Server 2014
https://support.microsoft.com/en-us/kb/3088307
FIX: Queries take longer time to compile in SQL Server 2014 than in earlier versions
https://support.microsoft.com/en-us/kb/3097114
FIX: DBCC CHECKFILEGROUP reports false inconsistency errors on database that has partitioned table in SQL Server 2014
https://support.microsoft.com/en-us/kb/3099323
FIX: Sort operator spills to tempdb in SQL Server 2012 or SQL Server 2014 when estimated number of rows and row size are correct
https://support.microsoft.com/en-us/kb/3088480
Trace flag 7470
FIX: Rare index corruption when you build a columnstore index with parallelism on a partitioned table in SQL Server 2014
https://support.microsoft.com/en-us/kb/3080155
FIX: Access violation occurs when you execute a query that contains many joins and coalesce functions in SQL Server 2014
https://support.microsoft.com/en-us/kb/3082860
FIX: DBCC CHECKDB doesn't report metadata inconsistency of invalid filegroups in its summary in SQL Server 2014
https://support.microsoft.com/en-us/kb/3100361
FIX: It takes a longer time than earlier versions of SQL Server when you run DBCC CHECKDB in SQL Server 2012
https://support.microsoft.com/en-us/kb/3092702
Thursday, October 15, 2015
Performance Pitfall - Act III: the Shyamalan twist!!
***** Update *****
#CarefulWhatYouAskFor :-)
Paul White (@SQL_kiwi) was willing to take a look at this, and gave it a very thorough treatment in the blog post below. Thanks, Paul!
Hash Joins on Nullable Columns
http://sqlperformance.com/2015/11/sql-plan/hash-joins-on-nullable-columns
*****
The last few days I've provided some details of my recent testing of SQL Server joins on NULLable columns.
http://sql-sasquatch.blogspot.com/2015/10/sql-server-hash-joins-on-nullable.html
http://sql-sasquatch.blogspot.com/2015/10/performance-pitfall-part-deux-sql.html
Time for a surprise twist! You'll remember that I am creating a table X_1 that looks like this:
x
The primary key in the first column is an increasing integer, step 1. I'm using 2012 compatibility mode today, so rather than the 600,000 rows I used with 2014 compatibility mode I have just 72048 rows.
Still going to use 32,000 rows in my X_2 table. But here's my gimmick for today. In the 32,000 rows of table X_2, I'm going to vary the number of NULL values. No big trick to doing that - just use a CASE statement with a MOD operator to assign value of 1 (or NULL) when MOD N is 0, ELSE NULL (or 1, as the case may be to get the desired count of NULL values).
In my first post on this investigation you'll see the basic code I'm using to populate these tables.
Recall that the almost-innocent-looking query I am testing is this:
So - what happens to execution time of the query as I change the number and distribution of NULL values throughout the X_2 table?
I'm glad you asked.
As the number of NULL values in the second column increased (while maintaining a total row count of 32,000), query CPU_ms dropped in a fairly linear fashion. Makes sense - as the number of NULL values increase, the join has less work to do - because NULL values can't join unless you do something tricky.
Ready for the Shyamalan twist?
Maybe it looks like I went all the way to 32,000 rows in the graph above. I didn't. Let me dramatically zoom out to catch what happened at 32,000 NULL values and ZERO non-NULL values.
x
Wow. OK. Maybe that's not "Sixth Sense" level twist. How about "Lady in the Lake"? :-)
So... SQL Server isn't treating the query the same without the implicit NULL exclusion that accompanies the join on the NULLable columns. But it doesn't REALLY cause a problem unless the column has only NULL values. And then, only after a certain number of rows in the table. And I haven't been able to tease out the problem unless I've got at least two joins.
#CarefulWhatYouAskFor :-)
Paul White (@SQL_kiwi) was willing to take a look at this, and gave it a very thorough treatment in the blog post below. Thanks, Paul!
Hash Joins on Nullable Columns
http://sqlperformance.com/2015/11/sql-plan/hash-joins-on-nullable-columns
*****
The last few days I've provided some details of my recent testing of SQL Server joins on NULLable columns.
http://sql-sasquatch.blogspot.com/2015/10/sql-server-hash-joins-on-nullable.html
http://sql-sasquatch.blogspot.com/2015/10/performance-pitfall-part-deux-sql.html
Time for a surprise twist! You'll remember that I am creating a table X_1 that looks like this:
x
The primary key in the first column is an increasing integer, step 1. I'm using 2012 compatibility mode today, so rather than the 600,000 rows I used with 2014 compatibility mode I have just 72048 rows.
Still going to use 32,000 rows in my X_2 table. But here's my gimmick for today. In the 32,000 rows of table X_2, I'm going to vary the number of NULL values. No big trick to doing that - just use a CASE statement with a MOD operator to assign value of 1 (or NULL) when MOD N is 0, ELSE NULL (or 1, as the case may be to get the desired count of NULL values).
In my first post on this investigation you'll see the basic code I'm using to populate these tables.
Recall that the almost-innocent-looking query I am testing is this:
SELECT t1.PK_X_1 FROM X_1 AS t1 INNER JOIN X_2 AS t2 ON t1.NOT_PK_X_1 = t2.NOT_PK_X_2 INNER JOIN X_2 AS t3 ON t2.PK_X_2 = t3.PK_X_2
So - what happens to execution time of the query as I change the number and distribution of NULL values throughout the X_2 table?
I'm glad you asked.
As the number of NULL values in the second column increased (while maintaining a total row count of 32,000), query CPU_ms dropped in a fairly linear fashion. Makes sense - as the number of NULL values increase, the join has less work to do - because NULL values can't join unless you do something tricky.
Ready for the Shyamalan twist?
Maybe it looks like I went all the way to 32,000 rows in the graph above. I didn't. Let me dramatically zoom out to catch what happened at 32,000 NULL values and ZERO non-NULL values.
x
Wow. OK. Maybe that's not "Sixth Sense" level twist. How about "Lady in the Lake"? :-)
So... SQL Server isn't treating the query the same without the implicit NULL exclusion that accompanies the join on the NULLable columns. But it doesn't REALLY cause a problem unless the column has only NULL values. And then, only after a certain number of rows in the table. And I haven't been able to tease out the problem unless I've got at least two joins.
Wednesday, October 14, 2015
Performance Pitfall - Part Deux!! SQL Server 2014 joins on NULLable columns - with 2012 compatibility mode
***** Update *****
#CarefulWhatYouAskFor :-)
Paul White (@SQL_kiwi) was willing to take a look at this, and gave it a very thorough treatment in the blog post below. Thanks, Paul!
Hash Joins on Nullable Columns
http://sqlperformance.com/2015/11/sql-plan/hash-joins-on-nullable-columns
*****
The following post is a companion to yesterday's post. For ease of reference:
I maintain that the basic problem is one of algebra - although the joins by definition excluded NULL values in the join columns, for some reason SQL Server treats the query differently when the NULL exclusions are made explicit. I'll get to that later.
First I want to turn to the other area of interest - predicting scaling behavior of this query when it does the wrong thing. Does this query with 2012 compatibility mode scale along two different linear patterns, as was seen in SQL Server 2014?
Yesterday I used an X_1 table of 600,000 rows and an X_2 table of up to 32,000 rows to demonstrate the two linear scaling functions for the query as X_1 maintained constant row count and X_2 rowcount varied.
I ain't getting any younger, so today I used an X_1 table of 72,048 rows and an X_2 table of up to 32,000 rows for the same purpose with 2012 compatibility mode. (Many of those queries in yesterday's post required over 25 minutes to complete! Too long to watch a hash join complete, I say!!) I'll leave determining why I used exactly 72,048 rows as an exercise for the reader. :-)
So... here's how CPU_ms varied with X_2 rowcount against an X_1 table with 72,048 rows.
And... here's a close-up of that very interesting area where there's a switch between the two linear functions.
So... some pics of the plans are coming. And other stuff. But I wanted to get this posted before I doze off for the night.
Here's a reminder. Yeah, the scaling of the query is interesting - I want to do additional work in the future to see if this type of pattern shows up in hash joins of non-null values. But... I still think in this case I am describing the scaling of SQL Server doing the wrong thing.
Because this query...
... with 72,048 X_1 rows and 32,000 X_2 rows requires only 1 millisecond of CPU time. Rather than 181641 ms of CPU time. That's a pretty hefty performance return for making an implicit filter explicit.
#CarefulWhatYouAskFor :-)
Paul White (@SQL_kiwi) was willing to take a look at this, and gave it a very thorough treatment in the blog post below. Thanks, Paul!
Hash Joins on Nullable Columns
http://sqlperformance.com/2015/11/sql-plan/hash-joins-on-nullable-columns
*****
The following post is a companion to yesterday's post. For ease of reference:
Performance Pitfall!! SQL Server hash joins on nullable columns
Yesterday's post was work done on SQL Server 2014 with 2014 compatibility mode. But that's not where the work actually started. Today I'll post about similar work on SQL Server 2014 with 2012 compatibility mode. Ok, ok... that's also not where the work started. But its as far back as I'll go because I'm still about 12 blog posts behind :-)
In yesterday's blog post I introduce my X_1 and X_2 testing tables. In column two of each of these tables, NULL values are extremely frequent by design to reproduce this performance berserker.
Recall the (in my opinion) innocent-looking query that caused trouble yesterday:
Recall the (in my opinion) innocent-looking query that caused trouble yesterday:
SELECT t1.PK_X_1 FROM X_1 AS t1 INNER JOIN X_2 AS t2 ON t1.NOT_PK_X_1 = t2.NOT_PK_X_2 INNER JOIN X_2 AS t3 ON t2.PK_X_2 = t3.PK_X_2;
I maintain that the basic problem is one of algebra - although the joins by definition excluded NULL values in the join columns, for some reason SQL Server treats the query differently when the NULL exclusions are made explicit. I'll get to that later.
First I want to turn to the other area of interest - predicting scaling behavior of this query when it does the wrong thing. Does this query with 2012 compatibility mode scale along two different linear patterns, as was seen in SQL Server 2014?
Yesterday I used an X_1 table of 600,000 rows and an X_2 table of up to 32,000 rows to demonstrate the two linear scaling functions for the query as X_1 maintained constant row count and X_2 rowcount varied.
I ain't getting any younger, so today I used an X_1 table of 72,048 rows and an X_2 table of up to 32,000 rows for the same purpose with 2012 compatibility mode. (Many of those queries in yesterday's post required over 25 minutes to complete! Too long to watch a hash join complete, I say!!) I'll leave determining why I used exactly 72,048 rows as an exercise for the reader. :-)
So... here's how CPU_ms varied with X_2 rowcount against an X_1 table with 72,048 rows.
And... here's a close-up of that very interesting area where there's a switch between the two linear functions.
So... some pics of the plans are coming. And other stuff. But I wanted to get this posted before I doze off for the night.
Here's a reminder. Yeah, the scaling of the query is interesting - I want to do additional work in the future to see if this type of pattern shows up in hash joins of non-null values. But... I still think in this case I am describing the scaling of SQL Server doing the wrong thing.
Because this query...
SELECT t1.PK_X_1 FROM X_1 AS t1 INNER JOIN X_2 AS t2 ON t1.NOT_PK_X_1 = t2.NOT_PK_X_2 INNER JOIN X_2 AS t3 ON t2.PK_X_2 = t3.PK_X_2 WHERE t2.NOT_PK_X_2 IS NOT NULL;
Tuesday, October 13, 2015
Performance Pitfall!! SQL Server joins on NULLable columns
***** Update *****
#CarefulWhatYouAskFor :-)
Paul White (@SQL_kiwi) was willing to take a look at this, and gave it a very thorough treatment in the blog post below. Thanks, Paul!
Hash Joins on Nullable Columns
http://sqlperformance.com/2015/11/sql-plan/hash-joins-on-nullable-columns
*****
The following work was performed in SQL Server 2014, with the 2014 compatibility mode set for the database. Further testing has shown similar behavior with the SQL Server 2012 compatibility mode, although thresholds for performance divots are different(much lower, actually).
Consider the following simple query:
select * from A join B on A.not_pk = B.not_pk
In tables A and B, the 'not_pk' column is a nonindexed, nullable column.
Even though A.not_pk and B.not_pk may both have NULL values, rows with null values must not join in that query - the join carries an implicit NOT NULL filter for both columns: them's the rules.
That is *usually* how the SQL Server behaves. But I get to work with a lot of outliers :-)
Lets start by creating two tables: a large X_1 table(600000 rows) and a smaller X_2 table(32000 rows). The code below might look a little intimidating, but the tables it creates are fairly simple. My colleague reproduced this situation we observed in the field before I was able to. The code below is actually a significant simplification of how he originally recreated the problem :-) Table X_1 has a monotonically increasing primary key in the first column. In the second column, every odd numbered row has a NULL value. Even numbered rows have row_number MOD 1024.
The X_2 table has positive and negative integers in the first column as primary key. The second column is NULL for the whole table.
Here's the code used to create the tables.
Now for the good stuff :-)
This query looks simple enough.
The query above consumed 1513643 ms of CPU time on SQL Server 2014 with 2014 compatibility mode! That's over 25 minutes! And almost all of the work was done by one of the parallel workers, so elapsed time was over 25 minutes as well!
What about the three logically equivalent queries below?
Wow!! 178 ms, 2 ms, 2 ms. I repeated several times to eliminate overhead from query plan compile, etc. So the original query took over 25 minutes - but making the implicit "IS NOT NULL" filter explicit cuts the time down to 2 ms!
I believe that's a bug in the optimizer - seems like it should consistently interpret joins of NULLable columns as if the "IS NOT NULL" filter is explicit. I've opened a ticket - lets see if the engineers agree with me :-)
I had something else I wanted to learn from this adventure. These queries leverage hash joins. Many online sources send mixed messages about hash joins - at once lauding their scalability while warning they may not scale well.
Well - what happens to CPU time accumulated when executing this query with different table sizes? I kept X_1 at 600000 rows, and experimented with the number of rows in X_2.
_
The first graph below shows two different linear patterns for CPU time. At 20609 X_2 rows and above, CPU time accumulated according to the steeper pattern. At 17814 X_2 rows and below, CPU time accumulated according to the less steep linear function.
This is a closeup showing the range where the two linear CPU time functions overlap, between 17815 and 20608 X_2 rows.
#CarefulWhatYouAskFor :-)
Paul White (@SQL_kiwi) was willing to take a look at this, and gave it a very thorough treatment in the blog post below. Thanks, Paul!
Hash Joins on Nullable Columns
http://sqlperformance.com/2015/11/sql-plan/hash-joins-on-nullable-columns
*****
The following work was performed in SQL Server 2014, with the 2014 compatibility mode set for the database. Further testing has shown similar behavior with the SQL Server 2012 compatibility mode, although thresholds for performance divots are different(much lower, actually).
Consider the following simple query:
select * from A join B on A.not_pk = B.not_pk
In tables A and B, the 'not_pk' column is a nonindexed, nullable column.
Even though A.not_pk and B.not_pk may both have NULL values, rows with null values must not join in that query - the join carries an implicit NOT NULL filter for both columns: them's the rules.
That is *usually* how the SQL Server behaves. But I get to work with a lot of outliers :-)
Lets start by creating two tables: a large X_1 table(600000 rows) and a smaller X_2 table(32000 rows). The code below might look a little intimidating, but the tables it creates are fairly simple. My colleague reproduced this situation we observed in the field before I was able to. The code below is actually a significant simplification of how he originally recreated the problem :-) Table X_1 has a monotonically increasing primary key in the first column. In the second column, every odd numbered row has a NULL value. Even numbered rows have row_number MOD 1024.
The X_2 table has positive and negative integers in the first column as primary key. The second column is NULL for the whole table.
Here's the code used to create the tables.
CREATE TABLE X_1 ( PK_X_1 NUMERIC(18,0) NOT NULL, NOT_PK_X_1 NUMERIC(18,0) NULL, PRIMARY KEY (PK_X_1) ) WITH (DATA_COMPRESSION = PAGE); CREATE STATISTICS PK_X_1 ON X_1(PK_X_1); CREATE STATISTICS NOT_PK_X_1 ON X_1(NOT_PK_X_1); ;WITH L1 AS (SELECT 1 AS c UNION ALL SELECT 1), --2 rows L2 AS (SELECT 1 AS c FROM L1 A CROSS JOIN L1 B CROSS JOIN L1 C), --8 rows L3 AS (SELECT 1 AS c FROM L2 A CROSS JOIN L2 B CROSS JOIN L2 C), --512 rows L4 AS (SELECT 1 AS c FROM L3 A CROSS JOIN L3 B), --262144 rows L5 AS (SELECT 1 AS c FROM L4 A CROSS JOIN L2 B CROSS JOIN L2 C), --8 * 2097152 rows NUMS AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS NUM FROM L5) INSERT INTO X_1 WITH (TABLOCK) SELECT TOP (600000) NUM, CASE WHEN NUM %2 = 1 THEN NULL ELSE NUM %1024 END FROM NUMS; UPDATE STATISTICS X_1 WITH FULLSCAN; CREATE TABLE X_2 ( PK_X_2 NUMERIC(18,0) NOT NULL, NOT_PK_X_2 NUMERIC(18,0) NULL, PRIMARY KEY (PK_X_2) ) WITH (DATA_COMPRESSION = PAGE); CREATE STATISTICS PK_X_2 ON X_2(PK_X_2); CREATE STATISTICS NOT_PK_X_2 ON X_2(NOT_PK_X_2); ;WITH L1 AS (SELECT 1 AS c UNION ALL SELECT 1), --2 rows L2 AS (SELECT 1 AS c FROM L1 A CROSS JOIN L1 B CROSS JOIN L1), --8 rows L3 AS (SELECT 1 AS c FROM L2 A CROSS JOIN L2 B CROSS JOIN L2 C), --512 rows L4 AS (SELECT 1 AS c FROM L3 A CROSS JOIN L3 B), --262144 rows NUMS AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS NUM FROM L4) INSERT INTO X_2 WITH (TABLOCK) SELECT TOP (32000) CASE WHEN NUM %2 = 1 Then ROUND( NUM / 2.0 , 0) ELSE ROUND( 0 - (NUM / 2.0) , 0) END, NULL FROM NUMS; UPDATE STATISTICS X_2 WITH FULLSCAN;
Now for the good stuff :-)
This query looks simple enough.
SELECT t1.PK_X_1 FROM X_1 AS t1 INNER JOIN X_2 AS t2 ON t1.NOT_PK_X_1 = t2.NOT_PK_X_2 INNER JOIN X_2 AS t3 ON t2.PK_X_2 = t3.PK_X_2;
The query above consumed 1513643 ms of CPU time on SQL Server 2014 with 2014 compatibility mode! That's over 25 minutes! And almost all of the work was done by one of the parallel workers, so elapsed time was over 25 minutes as well!
What about the three logically equivalent queries below?
-- SET NOCOUNT ON DECLARE @startCPU BIGINT; DECLARE @endCPU BIGINT; -- SELECT @startCPU = cpu_time FROM sys.dm_exec_requests WHERE session_id=@@spid; SELECT t1.PK_X_1 FROM X_1 AS t1 INNER JOIN X_2 AS t2 ON t1.NOT_PK_X_1 = t2.NOT_PK_X_2 INNER JOIN X_2 AS t3 ON t2.PK_X_2 = t3.PK_X_2 WHERE t1.NOT_PK_X_1 IS NOT NULL; SELECT @endCPU = cpu_time FROM sys.dm_exec_requests WHERE session_id=@@spid; PRINT @endCPU - @startCPU; -- SELECT @startCPU = cpu_time FROM sys.dm_exec_requests WHERE session_id=@@spid; SELECT t1.PK_X_1 FROM X_1 AS t1 INNER JOIN X_2 AS t2 ON t1.NOT_PK_X_1 = t2.NOT_PK_X_2 INNER JOIN X_2 AS t3 ON t2.PK_X_2 = t3.PK_X_2 WHERE t2.NOT_PK_X_2 IS NOT NULL; SELECT @endCPU = cpu_time FROM sys.dm_exec_requests WHERE session_id=@@spid; PRINT @endCPU - @startCPU; -- SELECT @startCPU = cpu_time FROM sys.dm_exec_requests WHERE session_id=@@spid; SELECT t1.PK_X_1 FROM X_1 AS t1 INNER JOIN X_2 AS t2 ON t1.NOT_PK_X_1 = t2.NOT_PK_X_2 INNER JOIN X_2 AS t3 ON t2.PK_X_2 = t3.PK_X_2 WHERE t2.NOT_PK_X_2 IS NOT NULL AND t1.NOT_PK_X_1 IS NOT NULL; SELECT @endCPU = cpu_time FROM sys.dm_exec_requests WHERE session_id=@@spid; PRINT @endCPU - @startCPU; --
Wow!! 178 ms, 2 ms, 2 ms. I repeated several times to eliminate overhead from query plan compile, etc. So the original query took over 25 minutes - but making the implicit "IS NOT NULL" filter explicit cuts the time down to 2 ms!
I believe that's a bug in the optimizer - seems like it should consistently interpret joins of NULLable columns as if the "IS NOT NULL" filter is explicit. I've opened a ticket - lets see if the engineers agree with me :-)
I had something else I wanted to learn from this adventure. These queries leverage hash joins. Many online sources send mixed messages about hash joins - at once lauding their scalability while warning they may not scale well.
Well - what happens to CPU time accumulated when executing this query with different table sizes? I kept X_1 at 600000 rows, and experimented with the number of rows in X_2.
_
The first graph below shows two different linear patterns for CPU time. At 20609 X_2 rows and above, CPU time accumulated according to the steeper pattern. At 17814 X_2 rows and below, CPU time accumulated according to the less steep linear function.
This is a closeup showing the range where the two linear CPU time functions overlap, between 17815 and 20608 X_2 rows.
Here's a table of the values - for me this is sometimes more useful than fancy graphs. So two takeaways:
1. If you join NULLable columns, look out for this issue!
2. If you can't make SQL Server do the right thing - maybe you can make it do the other thing faster? That would depend on catching the lower slope function rather than the steeper function for CPU time.
x_2_rows | cpu_ms |
1000 | 15373 |
2000 | 30535 |
4000 | 59895 |
6000 | 90245 |
8000 | 122635 |
10000 | 150310 |
12000 | 180071 |
14000 | 210166 |
16000 | 247200 |
17000 | 255422 |
17500 | 262884 |
17800 | 268588 |
17801 | 269375 |
17808 | 267344 |
17812 | 269297 |
17813 | 269805 |
17814 | 269802 |
17815 | 845591 |
17826 | 844924 |
17850 | 848245 |
17900 | 849694 |
18000 | 866731 |
18000 | 855672 |
18100 | 856142 |
18102 | 858401 |
18124 | 858765 |
18128 | 857404 |
18129 | 272006 |
18130 | 271808 |
18132 | 273134 |
18142 | 272440 |
18150 | 272994 |
18200 | 272976 |
18300 | 276158 |
18500 | 277696 |
18900 | 283824 |
19000 | 285213 |
20000 | 300301 |
20000 | 300339 |
20100 | 301786 |
20500 | 309394 |
20600 | 311064 |
20606 | 310815 |
20607 | 314863 |
20608 | 308786 |
20609 | 982545 |
20612 | 981406 |
20624 | 981828 |
20650 | 981884 |
20700 | 978802 |
20800 | 992951 |
21000 | 994604 |
22000 | 1043272 |
24000 | 1134272 |
26000 | 1231392 |
28000 | 1327950 |
30000 | 1418509 |
32000 | 1513643 |
Saturday, October 3, 2015
Diskspd.exe - Msg when IFI not available
I'm doing some work with Microsoft's relatively new DiskSpd.exe.
You can find this tool and documentation at the following link.
Diskspd, a Robust Storage Testing Tool, Now Publically Available
https://gallery.technet.microsoft.com/DiskSpd-a-robust-storage-6cd2f223
This will be one of my shortest blog posts ever :-)
Just wanted to leave a note that, if you use the -c option to create a test file and IFI (instant file initialization) is not available for the account running DiskSpd, you'll see this message:
Error adjusting token privileges for SeManageVolumePrivilege (error code: 1300)
WARNING: Could not set privileges for setting valid file size; will use a slower method of preparing the file
I like messages that make the situation clear :-)
Ciao!
You can find this tool and documentation at the following link.
Diskspd, a Robust Storage Testing Tool, Now Publically Available
https://gallery.technet.microsoft.com/DiskSpd-a-robust-storage-6cd2f223
This will be one of my shortest blog posts ever :-)
Just wanted to leave a note that, if you use the -c option to create a test file and IFI (instant file initialization) is not available for the account running DiskSpd, you'll see this message:
Error adjusting token privileges for SeManageVolumePrivilege (error code: 1300)
WARNING: Could not set privileges for setting valid file size; will use a slower method of preparing the file
I like messages that make the situation clear :-)
Ciao!
Subscribe to:
Posts (Atom)