#CarefulWhatYouAskFor :-)
Paul White (@SQL_kiwi) was willing to take a look at this, and gave it a very thorough treatment in the blog post below. Thanks, Paul!
Hash Joins on Nullable Columns
http://sqlperformance.com/2015/11/sql-plan/hash-joins-on-nullable-columns
*****
The following work was performed on SQL Server 2014, with the database at the 2014 compatibility level. Further testing has shown similar behavior at the SQL Server 2012 compatibility level, although the thresholds for the performance divots are different (much lower, actually).
Consider the following simple query:
select * from A join B on A.not_pk = B.not_pk
In tables A and B, the 'not_pk' column is a nonindexed, nullable column.
Even though A.not_pk and B.not_pk may both contain NULL values, rows with NULLs must not join in that query - the join carries an implicit NOT NULL filter on both columns: them's the rules.
That is *usually* how SQL Server behaves. But I get to work with a lot of outliers :-)
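To illustrate the rule before diving into the repro - this is a hypothetical two-row example of my own, not part of the test tables below:

```sql
-- Hypothetical mini-tables, just to show the implicit NOT NULL filter.
CREATE TABLE #A (not_pk INT NULL);
CREATE TABLE #B (not_pk INT NULL);
INSERT INTO #A VALUES (1), (NULL);
INSERT INTO #B VALUES (1), (NULL);

-- NULL = NULL evaluates to UNKNOWN, so the NULL rows cannot join.
SELECT A.not_pk
FROM #A AS A
JOIN #B AS B ON A.not_pk = B.not_pk;
-- Only the not_pk = 1 row qualifies; adding an explicit
-- WHERE A.not_pk IS NOT NULL AND B.not_pk IS NOT NULL
-- cannot change the result set - only, as we'll see, the plan.
```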
Let's start by creating two tables: a large X_1 table (600000 rows) and a smaller X_2 table (32000 rows). The code below might look a little intimidating, but the tables it creates are fairly simple. A colleague reproduced this situation we observed in the field before I was able to; the code below is actually a significant simplification of his original repro :-) Table X_1 has a monotonically increasing primary key in its first column. In the second column, every odd-numbered row is NULL; even-numbered rows hold the row number MOD 1024.
The X_2 table has positive and negative integers as the primary key in its first column. The second column is NULL for the whole table.
Here's the code used to create the tables.
CREATE TABLE X_1
( PK_X_1     NUMERIC(18,0) NOT NULL,
  NOT_PK_X_1 NUMERIC(18,0) NULL,
  PRIMARY KEY (PK_X_1)
) WITH (DATA_COMPRESSION = PAGE);

CREATE STATISTICS PK_X_1 ON X_1(PK_X_1);
CREATE STATISTICS NOT_PK_X_1 ON X_1(NOT_PK_X_1);

;WITH
L1 AS (SELECT 1 AS c UNION ALL SELECT 1),                        --2 rows
L2 AS (SELECT 1 AS c FROM L1 A CROSS JOIN L1 B CROSS JOIN L1 C), --8 rows
L3 AS (SELECT 1 AS c FROM L2 A CROSS JOIN L2 B CROSS JOIN L2 C), --512 rows
L4 AS (SELECT 1 AS c FROM L3 A CROSS JOIN L3 B),                 --262144 rows
L5 AS (SELECT 1 AS c FROM L4 A CROSS JOIN L2 B CROSS JOIN L2 C), --8 * 2097152 rows
NUMS AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS NUM FROM L5)
INSERT INTO X_1 WITH (TABLOCK)
SELECT TOP (600000)
       NUM,
       CASE WHEN NUM % 2 = 1 THEN NULL ELSE NUM % 1024 END
FROM NUMS;

UPDATE STATISTICS X_1 WITH FULLSCAN;

CREATE TABLE X_2
( PK_X_2     NUMERIC(18,0) NOT NULL,
  NOT_PK_X_2 NUMERIC(18,0) NULL,
  PRIMARY KEY (PK_X_2)
) WITH (DATA_COMPRESSION = PAGE);

CREATE STATISTICS PK_X_2 ON X_2(PK_X_2);
CREATE STATISTICS NOT_PK_X_2 ON X_2(NOT_PK_X_2);

;WITH
L1 AS (SELECT 1 AS c UNION ALL SELECT 1),                        --2 rows
L2 AS (SELECT 1 AS c FROM L1 A CROSS JOIN L1 B CROSS JOIN L1 C), --8 rows
L3 AS (SELECT 1 AS c FROM L2 A CROSS JOIN L2 B CROSS JOIN L2 C), --512 rows
L4 AS (SELECT 1 AS c FROM L3 A CROSS JOIN L3 B),                 --262144 rows
NUMS AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS NUM FROM L4)
INSERT INTO X_2 WITH (TABLOCK)
SELECT TOP (32000)
       CASE WHEN NUM % 2 = 1 THEN ROUND(NUM / 2.0, 0)
            ELSE ROUND(0 - (NUM / 2.0), 0) END,
       NULL
FROM NUMS;

UPDATE STATISTICS X_2 WITH FULLSCAN;
Now for the good stuff :-)
This query looks simple enough.
SELECT t1.PK_X_1
FROM X_1 AS t1
INNER JOIN X_2 AS t2 ON t1.NOT_PK_X_1 = t2.NOT_PK_X_2
INNER JOIN X_2 AS t3 ON t2.PK_X_2 = t3.PK_X_2;
The query above consumed 1513643 ms of CPU time on SQL Server 2014 with 2014 compatibility mode! That's over 25 minutes! And almost all of the work was done by one of the parallel workers, so elapsed time was over 25 minutes as well!
What about the three logically equivalent queries below?
--
SET NOCOUNT ON
DECLARE @startCPU BIGINT;
DECLARE @endCPU BIGINT;
--
SELECT @startCPU = cpu_time FROM sys.dm_exec_requests WHERE session_id = @@spid;
SELECT t1.PK_X_1
FROM X_1 AS t1
INNER JOIN X_2 AS t2 ON t1.NOT_PK_X_1 = t2.NOT_PK_X_2
INNER JOIN X_2 AS t3 ON t2.PK_X_2 = t3.PK_X_2
WHERE t1.NOT_PK_X_1 IS NOT NULL;
SELECT @endCPU = cpu_time FROM sys.dm_exec_requests WHERE session_id = @@spid;
PRINT @endCPU - @startCPU;
--
SELECT @startCPU = cpu_time FROM sys.dm_exec_requests WHERE session_id = @@spid;
SELECT t1.PK_X_1
FROM X_1 AS t1
INNER JOIN X_2 AS t2 ON t1.NOT_PK_X_1 = t2.NOT_PK_X_2
INNER JOIN X_2 AS t3 ON t2.PK_X_2 = t3.PK_X_2
WHERE t2.NOT_PK_X_2 IS NOT NULL;
SELECT @endCPU = cpu_time FROM sys.dm_exec_requests WHERE session_id = @@spid;
PRINT @endCPU - @startCPU;
--
SELECT @startCPU = cpu_time FROM sys.dm_exec_requests WHERE session_id = @@spid;
SELECT t1.PK_X_1
FROM X_1 AS t1
INNER JOIN X_2 AS t2 ON t1.NOT_PK_X_1 = t2.NOT_PK_X_2
INNER JOIN X_2 AS t3 ON t2.PK_X_2 = t3.PK_X_2
WHERE t2.NOT_PK_X_2 IS NOT NULL
  AND t1.NOT_PK_X_1 IS NOT NULL;
SELECT @endCPU = cpu_time FROM sys.dm_exec_requests WHERE session_id = @@spid;
PRINT @endCPU - @startCPU;
--
Wow!! 178 ms, 2 ms, 2 ms. I repeated the runs several times to eliminate overhead from query plan compilation, etc. So the original query took over 25 minutes - but making the implicit "IS NOT NULL" filter explicit cuts the time down to as little as 2 ms!
I believe that's a bug in the optimizer - it seems like it should consistently treat joins on NULLable columns as if the "IS NOT NULL" filters were explicit. I've opened a ticket - let's see if the engineers agree with me :-)
I had something else I wanted to learn from this adventure. These queries leverage hash joins. Many online sources send mixed messages about hash joins - at once lauding their scalability and warning that they may not scale well.
Well - what happens to CPU time accumulated when executing this query with different table sizes? I kept X_1 at 600000 rows, and experimented with the number of rows in X_2.
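My exact test harness isn't shown here, but the sweep can be sketched like this: before each timed run, repopulate X_2 at the desired size using the same number generator as the setup script, varying only the TOP clause (the literal 20000 below is just one sample point):

```sql
-- Sketch: rebuild X_2 at a chosen row count before each timed run.
TRUNCATE TABLE X_2;

;WITH
L1 AS (SELECT 1 AS c UNION ALL SELECT 1),
L2 AS (SELECT 1 AS c FROM L1 A CROSS JOIN L1 B CROSS JOIN L1 C),
L3 AS (SELECT 1 AS c FROM L2 A CROSS JOIN L2 B CROSS JOIN L2 C),
L4 AS (SELECT 1 AS c FROM L3 A CROSS JOIN L3 B),
NUMS AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS NUM FROM L4)
INSERT INTO X_2 WITH (TABLOCK)
SELECT TOP (20000) -- vary this per run: 1000, 2000, ... 32000
       CASE WHEN NUM % 2 = 1 THEN ROUND(NUM / 2.0, 0)
            ELSE ROUND(0 - (NUM / 2.0), 0) END,
       NULL
FROM NUMS;

-- Refresh statistics so each run compiles against accurate row counts.
UPDATE STATISTICS X_2 WITH FULLSCAN;
```

Then time the three-table join with the sys.dm_exec_requests cpu_time bracketing shown earlier.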
The first graph below shows two different linear patterns for CPU time. At 20609 X_2 rows and above, CPU time accumulated according to the steeper pattern. At 17814 X_2 rows and below, CPU time accumulated according to the less steep linear function.
This is a closeup showing the range where the two linear CPU time functions overlap, between 17815 and 20608 X_2 rows.
Here's a table of the values - for me this is sometimes more useful than fancy graphs. Two takeaways:
1. If you join NULLable columns, look out for this issue!
2. If you can't make SQL Server do the right thing - maybe you can make it do the other thing faster? That would depend on catching the lower slope function rather than the steeper function for CPU time.
x_2_rows | cpu_ms |
1000 | 15373 |
2000 | 30535 |
4000 | 59895 |
6000 | 90245 |
8000 | 122635 |
10000 | 150310 |
12000 | 180071 |
14000 | 210166 |
16000 | 247200 |
17000 | 255422 |
17500 | 262884 |
17800 | 268588 |
17801 | 269375 |
17808 | 267344 |
17812 | 269297 |
17813 | 269805 |
17814 | 269802 |
17815 | 845591 |
17826 | 844924 |
17850 | 848245 |
17900 | 849694 |
18000 | 866731 |
18000 | 855672 |
18100 | 856142 |
18102 | 858401 |
18124 | 858765 |
18128 | 857404 |
18129 | 272006 |
18130 | 271808 |
18132 | 273134 |
18142 | 272440 |
18150 | 272994 |
18200 | 272976 |
18300 | 276158 |
18500 | 277696 |
18900 | 283824 |
19000 | 285213 |
20000 | 300301 |
20000 | 300339 |
20100 | 301786 |
20500 | 309394 |
20600 | 311064 |
20606 | 310815 |
20607 | 314863 |
20608 | 308786 |
20609 | 982545 |
20612 | 981406 |
20624 | 981828 |
20650 | 981884 |
20700 | 978802 |
20800 | 992951 |
21000 | 994604 |
22000 | 1043272 |
24000 | 1134272 |
26000 | 1231392 |
28000 | 1327950 |
30000 | 1418509 |
32000 | 1513643 |
Thank you! I've found the same problem and it's good to know I'm not crazy.