Paul White (@SQL_kiwi) was willing to take a look at this, and gave it a very thorough treatment in the blog post below. Thanks, Paul!
Hash Joins on Nullable Columns
The following work was performed on SQL Server 2014, with the database at the 2014 compatibility level. Further testing has shown similar behavior at the SQL Server 2012 compatibility level, although the thresholds for the performance divots are different (much lower, actually).
Consider the following simple query:
select * from A join B on A.not_pk = B.not_pk
In tables A and B, the 'not_pk' column is a nonindexed, nullable column.
That is *usually* how SQL Server behaves. But I get to work with a lot of outliers :-)
Let's start by creating two tables: a large X_1 table (600000 rows) and a smaller X_2 table (32000 rows). The code below might look a little intimidating, but the tables it creates are fairly simple. My colleague reproduced the situation we observed in the field before I was able to; the code below is actually a significant simplification of how he originally recreated the problem :-) Table X_1 has a monotonically increasing primary key in its first column. In the second column, every odd-numbered row is NULL; even-numbered rows contain the row number MOD 1024.
The X_2 table has both positive and negative integers in its first column, the primary key. The second column is NULL for every row.
Here's the code used to create the tables.
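The original setup script isn't reproduced in this excerpt, so here is a minimal sketch matching the description above. The table names X_1 and X_2 come from the text; the column names pk and not_pk, and the use of a cross join over sys.all_columns as a row source, are my assumptions - the author's actual script may differ.

```sql
-- Hypothetical reconstruction; column names pk and not_pk are assumed.
CREATE TABLE X_1 (pk INT NOT NULL PRIMARY KEY, not_pk INT NULL);

-- 600000 rows: odd-numbered rows get NULL, even-numbered rows get row number MOD 1024
;WITH n AS (
    SELECT TOP (600000)
           ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS rn
    FROM sys.all_columns AS a CROSS JOIN sys.all_columns AS b
)
INSERT INTO X_1 (pk, not_pk)
SELECT rn,
       CASE WHEN rn % 2 = 1 THEN NULL ELSE rn % 1024 END
FROM n;

CREATE TABLE X_2 (pk INT NOT NULL PRIMARY KEY, not_pk INT NULL);

-- 32000 rows: primary key spans negative and positive integers; not_pk is always NULL
;WITH n AS (
    SELECT TOP (32000)
           ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS rn
    FROM sys.all_columns AS a CROSS JOIN sys.all_columns AS b
)
INSERT INTO X_2 (pk, not_pk)
SELECT rn - 16000, NULL
FROM n;
```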
Now for the good stuff :-)
This query looks simple enough.
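The query text isn't shown inline in this excerpt, but per the description it is a plain inner join of the two tables on their nullable columns. A sketch, assuming the nullable column in each table is named not_pk:

```sql
-- Assumed reconstruction of the slow query; not_pk is a hypothetical column name.
SELECT *
FROM X_1
JOIN X_2
  ON X_1.not_pk = X_2.not_pk;
```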
The query above consumed 1513643 ms of CPU time on SQL Server 2014 with 2014 compatibility mode! That's over 25 minutes! And almost all of the work was done by one of the parallel workers, so elapsed time was over 25 minutes as well!
What about the three logically equivalent queries below?
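The three rewrites aren't reproduced in this excerpt, but based on the follow-up they make the implicit "IS NOT NULL" filter explicit on one or both sides of the join. A sketch, again assuming the column name not_pk; all three are logically equivalent to the original because NULL = NULL never satisfies an inner join predicate:

```sql
-- Assumed reconstructions of the three rewrites.

-- Filter the large table's nullable column
SELECT * FROM X_1 JOIN X_2 ON X_1.not_pk = X_2.not_pk
WHERE X_1.not_pk IS NOT NULL;

-- Filter the small table's nullable column
SELECT * FROM X_1 JOIN X_2 ON X_1.not_pk = X_2.not_pk
WHERE X_2.not_pk IS NOT NULL;

-- Filter both
SELECT * FROM X_1 JOIN X_2 ON X_1.not_pk = X_2.not_pk
WHERE X_1.not_pk IS NOT NULL
  AND X_2.not_pk IS NOT NULL;
```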
Wow!! 178 ms, 2 ms, 2 ms. I repeated the tests several times to eliminate overhead from query plan compilation, etc. So the original query took over 25 minutes - but making the implicit "IS NOT NULL" filter explicit cuts the time down to 2 ms!
I believe that's a bug in the optimizer - it seems like it should consistently interpret joins on NULLable columns as if the "IS NOT NULL" filter were explicit. I've opened a ticket - let's see if the engineers agree with me :-)
I had something else I wanted to learn from this adventure. These queries leverage hash joins, and many online sources send mixed messages about them - at once lauding their scalability and warning that they may not scale well.
Well - what happens to CPU time accumulated when executing this query with different table sizes? I kept X_1 at 600000 rows, and experimented with the number of rows in X_2.
The first graph below shows two different linear patterns for CPU time. At 20609 X_2 rows and above, CPU time accumulated according to the steeper pattern. At 17814 X_2 rows and below, CPU time accumulated according to the less steep linear function.
This is a closeup showing the range where the two linear CPU time functions overlap, between 17815 and 20608 X_2 rows.
Here's a table of the values - for me this is sometimes more useful than fancy graphs. So two takeaways:
1. If you join NULLable columns, look out for this issue!
2. If you can't make SQL Server do the right thing - maybe you can make it do the other thing faster? That would depend on keeping the workload on the lower-slope CPU time function rather than the steeper one.