Tuesday, October 13, 2015

Performance Pitfall!! SQL Server joins on NULLable columns

***** Update *****
#CarefulWhatYouAskFor :-)
Paul White (@SQL_kiwi) was willing to take a look at this, and gave it a very thorough treatment in the blog post below.  Thanks, Paul!

Hash Joins on Nullable Columns
http://sqlperformance.com/2015/11/sql-plan/hash-joins-on-nullable-columns
*****


The following work was performed in SQL Server 2014, with the 2014 compatibility level set for the database.  Further testing has shown similar behavior with the SQL Server 2012 compatibility level, although the thresholds for the performance divots are different (much lower, actually).

Consider the following simple query:
select * from A join B on A.not_pk = B.not_pk

In tables A and B, the 'not_pk' column is a nonindexed, nullable column.
Even though A.not_pk and B.not_pk may both contain NULL values, rows with NULL values must not join in that query - a comparison against NULL never evaluates to TRUE, so the join carries an implicit NOT NULL filter on both columns: them's the rules.
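
In other words (a minimal sketch using the hypothetical A and B tables above - not a query from the repro), the optimizer is free to treat the join as if the NULL-rejecting filters were written out explicitly:

-- Logically equivalent to the bare join: a NULL in either column
-- can never satisfy the equality predicate.
SELECT *
FROM       A
INNER JOIN B ON A.not_pk = B.not_pk
WHERE A.not_pk IS NOT NULL
  AND B.not_pk IS NOT NULL;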

That is *usually* how SQL Server behaves.  But I get to work with a lot of outliers :-)

Let's start by creating two tables: a large X_1 table (600000 rows) and a smaller X_2 table (32000 rows).  The code below might look a little intimidating, but the tables it creates are fairly simple.  My colleague reproduced the situation we observed in the field before I was able to; the code below is actually a significant simplification of how he originally recreated the problem :-) Table X_1 has a monotonically increasing primary key in its first column.  In the second column, every odd-numbered row has a NULL value; even-numbered rows contain the row number MOD 1024.


The X_2 table has positive and negative integers in its first column as the primary key.  The second column is NULL for the whole table.



Here's the code used to create the tables.


CREATE TABLE X_1 (
       PK_X_1     NUMERIC(18,0) NOT NULL,
       NOT_PK_X_1 NUMERIC(18,0) NULL,
       PRIMARY KEY (PK_X_1)
)
WITH (DATA_COMPRESSION = PAGE);
CREATE STATISTICS PK_X_1 ON X_1(PK_X_1);
CREATE STATISTICS NOT_PK_X_1 ON X_1(NOT_PK_X_1);

;WITH
       L1 AS (SELECT 1 AS c UNION ALL SELECT 1),                          --2 rows
       L2 AS (SELECT 1 AS c FROM L1 A CROSS JOIN L1 B CROSS JOIN L1 C),   --8 rows
       L3 AS (SELECT 1 AS c FROM L2 A CROSS JOIN L2 B CROSS JOIN L2 C),   --512 rows
       L4 AS (SELECT 1 AS c FROM L3 A CROSS JOIN L3 B),                   --262144 rows
       L5 AS (SELECT 1 AS c FROM L4 A CROSS JOIN L2 B CROSS JOIN L2 C),   --16777216 rows 
       NUMS AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS NUM FROM L5)
INSERT INTO X_1 WITH (TABLOCK)
SELECT TOP (600000)
       NUM, 
       CASE 
            WHEN NUM %2 = 1 THEN NULL
            ELSE NUM %1024
       END
FROM NUMS;
UPDATE STATISTICS X_1 WITH FULLSCAN;
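
As a quick sanity check (my own addition, not part of the original repro script), the generated data should come out as 600000 rows, with NULL in NOT_PK_X_1 for half of them and 512 distinct non-NULL values (the even residues of MOD 1024):

-- Expect total_rows = 600000, null_rows = 300000, distinct_not_null_values = 512.
SELECT COUNT(*)                                            AS total_rows,
       SUM(CASE WHEN NOT_PK_X_1 IS NULL THEN 1 ELSE 0 END) AS null_rows,
       COUNT(DISTINCT NOT_PK_X_1)                          AS distinct_not_null_values
FROM X_1;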


CREATE TABLE X_2 (
       PK_X_2     NUMERIC(18,0) NOT NULL,
       NOT_PK_X_2 NUMERIC(18,0) NULL,
       PRIMARY KEY (PK_X_2)
)
WITH (DATA_COMPRESSION = PAGE);
CREATE STATISTICS PK_X_2 ON X_2(PK_X_2);
CREATE STATISTICS NOT_PK_X_2 ON X_2(NOT_PK_X_2);

;WITH
       L1 AS (SELECT 1 AS c UNION ALL SELECT 1),                          --2 rows
       L2 AS (SELECT 1 AS c FROM L1 A CROSS JOIN L1 B CROSS JOIN L1 C),   --8 rows
       L3 AS (SELECT 1 AS c FROM L2 A CROSS JOIN L2 B CROSS JOIN L2 C),   --512 rows
       L4 AS (SELECT 1 AS c FROM L3 A CROSS JOIN L3 B),                   --262144 rows
       NUMS AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS NUM FROM L4)  
INSERT INTO X_2 WITH (TABLOCK)
SELECT TOP (32000)
       CASE
            WHEN NUM %2 = 1 THEN ROUND( NUM / 2.0 , 0) 
            ELSE ROUND( 0 - (NUM / 2.0) , 0)
       END,
       NULL
FROM NUMS;
UPDATE STATISTICS X_2 WITH FULLSCAN;



Now for the good stuff :-)
This query looks simple enough.


SELECT 
           t1.PK_X_1
FROM       X_1 AS t1
INNER JOIN X_2 AS t2 ON t1.NOT_PK_X_1 = t2.NOT_PK_X_2
INNER JOIN X_2 AS t3 ON t2.PK_X_2     = t3.PK_X_2;

The query above consumed 1,513,643 ms of CPU time on SQL Server 2014 with the 2014 compatibility level!  That's over 25 minutes!  And almost all of the work was done by one of the parallel workers, so elapsed time was over 25 minutes as well!
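
If you want to watch that per-thread skew for yourself while the query is running, one option is sys.dm_exec_query_profiles from a second session (a hedged sketch - the session_id below is hypothetical, and on SQL Server 2014 this DMV only reports on sessions that requested an actual plan, e.g. via SET STATISTICS XML ON or SET STATISTICS PROFILE ON):

-- Row counts per thread for each plan operator of the running query;
-- replace 53 with the session_id of the session executing the slow join.
SELECT node_id,
       physical_operator_name,
       thread_id,
       row_count
FROM sys.dm_exec_query_profiles
WHERE session_id = 53
ORDER BY node_id, thread_id;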

What about the three logically equivalent queries below?

--
SET NOCOUNT ON
DECLARE @startCPU BIGINT;
DECLARE @endCPU   BIGINT;
--
SELECT @startCPU = cpu_time FROM sys.dm_exec_requests WHERE session_id=@@spid;

SELECT 
           t1.PK_X_1
FROM       X_1 AS t1
INNER JOIN X_2 AS t2 ON t1.NOT_PK_X_1 = t2.NOT_PK_X_2
INNER JOIN X_2 AS t3 ON t2.PK_X_2     = t3.PK_X_2
WHERE t1.NOT_PK_X_1 IS NOT NULL;

SELECT @endCPU = cpu_time FROM sys.dm_exec_requests WHERE session_id=@@spid;
PRINT @endCPU - @startCPU;

--
SELECT @startCPU = cpu_time FROM sys.dm_exec_requests WHERE session_id=@@spid;

SELECT 
           t1.PK_X_1
FROM       X_1 AS t1
INNER JOIN X_2 AS t2 ON t1.NOT_PK_X_1 = t2.NOT_PK_X_2
INNER JOIN X_2 AS t3 ON t2.PK_X_2     = t3.PK_X_2
WHERE t2.NOT_PK_X_2 IS NOT NULL;

SELECT @endCPU = cpu_time FROM sys.dm_exec_requests WHERE session_id=@@spid;
PRINT @endCPU - @startCPU;

--
SELECT @startCPU = cpu_time FROM sys.dm_exec_requests WHERE session_id=@@spid;

SELECT 
           t1.PK_X_1
FROM       X_1 AS t1
INNER JOIN X_2 AS t2 ON t1.NOT_PK_X_1 = t2.NOT_PK_X_2
INNER JOIN X_2 AS t3 ON t2.PK_X_2     = t3.PK_X_2
WHERE t2.NOT_PK_X_2 IS NOT NULL
  AND t1.NOT_PK_X_1 IS NOT NULL;

SELECT @endCPU = cpu_time FROM sys.dm_exec_requests WHERE session_id=@@spid;
PRINT @endCPU - @startCPU;
--


Wow!! 178 ms, 2 ms, 2 ms.  I repeated the runs several times to eliminate overhead from query plan compilation, etc.  So the original query took over 25 minutes - but making the implicit "IS NOT NULL" filter explicit cuts the time down to as little as 2 ms!

I believe that's a bug in the optimizer - it seems like it should consistently interpret joins on NULLable columns as if the "IS NOT NULL" filters were explicit.  I've opened a ticket - let's see if the engineers agree with me :-)

I had something else I wanted to learn from this adventure.  These queries use hash joins.  Many online sources send mixed messages about hash joins - at once lauding their scalability and warning that they may not scale well.

Well - what happens to the CPU time accumulated when executing this query with different table sizes?  I kept X_1 at 600000 rows and experimented with the number of rows in X_2.
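
The measurements were collected with a harness along these lines (a sketch, not the exact script I ran - the row count shown is just an example): reload X_2 at the target size, update statistics, then run the original slow query and record the CPU time delta.

DECLARE @rows BIGINT = 18000;   -- example X_2 row count for this run

TRUNCATE TABLE X_2;

;WITH
       L1 AS (SELECT 1 AS c UNION ALL SELECT 1),                          --2 rows
       L2 AS (SELECT 1 AS c FROM L1 A CROSS JOIN L1 B CROSS JOIN L1 C),   --8 rows
       L3 AS (SELECT 1 AS c FROM L2 A CROSS JOIN L2 B CROSS JOIN L2 C),   --512 rows
       L4 AS (SELECT 1 AS c FROM L3 A CROSS JOIN L3 B),                   --262144 rows
       NUMS AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS NUM FROM L4)
INSERT INTO X_2 WITH (TABLOCK)
SELECT TOP (@rows)
       CASE
            WHEN NUM %2 = 1 THEN ROUND( NUM / 2.0 , 0)
            ELSE ROUND( 0 - (NUM / 2.0) , 0)
       END,
       NULL
FROM NUMS;
UPDATE STATISTICS X_2 WITH FULLSCAN;

DECLARE @startCPU BIGINT;
DECLARE @endCPU   BIGINT;
SELECT @startCPU = cpu_time FROM sys.dm_exec_requests WHERE session_id=@@spid;

SELECT 
           t1.PK_X_1
FROM       X_1 AS t1
INNER JOIN X_2 AS t2 ON t1.NOT_PK_X_1 = t2.NOT_PK_X_2
INNER JOIN X_2 AS t3 ON t2.PK_X_2     = t3.PK_X_2;

SELECT @endCPU = cpu_time FROM sys.dm_exec_requests WHERE session_id=@@spid;
PRINT @endCPU - @startCPU;
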
The first graph below shows two different linear patterns for CPU time. At 20609 X_2 rows and above, CPU time accumulated according to the steeper pattern. At 17814 X_2 rows and below, CPU time accumulated according to the less steep linear function.


This is a closeup showing the range where the two linear CPU time functions overlap, between 17815 and 20608 X_2 rows.


Here's a table of the values - for me this is sometimes more useful than fancy graphs.  So, two takeaways: 
1. If you join on NULLable columns, look out for this issue!
2. If you can't make SQL Server do the right thing - maybe you can make it do the other thing faster? That would depend on staying on the lower-slope CPU time function rather than the steeper one.

x_2_rows    cpu_ms
1000 15373
2000 30535
4000 59895
6000 90245
8000 122635
10000 150310
12000 180071
14000 210166
16000 247200
17000 255422
17500 262884
17800 268588
17801 269375
17808 267344
17812 269297
17813 269805
17814 269802
17815 845591
17826 844924
17850 848245
17900 849694
18000 866731
18000 855672
18100 856142
18102 858401
18124 858765
18128 857404
18129 272006
18130 271808
18132 273134
18142 272440
18150 272994
18200 272976
18300 276158
18500 277696
18900 283824
19000 285213
20000 300301
20000 300339
20100 301786
20500 309394
20600 311064
20606 310815
20607 314863
20608 308786
20609 982545
20612 981406
20624 981828
20650 981884
20700 978802
20800 992951
21000 994604
22000 1043272
24000 1134272
26000 1231392
28000 1327950
30000 1418509
32000 1513643

1 comment:

  1. Thank you! I've found the same problem and it's good to know I'm not crazy.
