Wednesday, October 14, 2015

Performance Pitfall - Part Deux!! SQL Server 2014 joins on NULLable columns - with 2012 compatibility mode

***** Update *****
#CarefulWhatYouAskFor :-)
Paul White (@SQL_kiwi) was willing to take a look at this, and gave it a very thorough treatment in the blog post below.  Thanks, Paul!

Hash Joins on Nullable Columns

The following post is a companion to yesterday's post.  For ease of reference:
Performance Pitfall!! SQL Server hash joins on nullable columns

Yesterday's post was work done on SQL Server 2014 with 2014 compatibility mode.  But that's not where the work actually started.  Today I'll post about similar work on SQL Server 2014 with 2012 compatibility mode.  Ok, ok... that's also not where the work started.  But its as far back as I'll go because I'm still about 12 blog posts behind :-)

In yesterday's blog post I introduce my X_1 and X_2 testing tables.  In column two of each of these tables, NULL values are extremely frequent by design to reproduce this performance berserker.

Recall the (in my opinion) innocent-looking query that caused trouble yesterday:

FROM       X_1 AS t1
INNER JOIN X_2 AS t2 ON t1.NOT_PK_X_1 = t2.NOT_PK_X_2
INNER JOIN X_2 AS t3 ON t2.PK_X_2     = t3.PK_X_2;

I maintain that the basic problem is one of algebra - although the joins by definition excluded NULL values in the join columns, for some reason SQL Server treats the query differently when the NULL exclusions are made explicit.  I'll get to that later.

First I want to turn to the other area of interest - predicting scaling behavior of this query when it does the wrong thing.  Does this query with 2012 compatibility mode scale along two different linear patterns, as was seen in SQL Server 2014?

Yesterday I used an X_1 table of 600,000 rows and an X_2 table of up to 32,000 rows to demonstrate the two linear scaling functions for the query as X_1 maintained constant row count and X_2 rowcount varied.

I ain't getting any younger, so today I used an X_1 table of 72,048 rows and an X_2 table of up to 32,000 rows for the same purpose with 2012 compatibility mode.  (Many of those queries in yesterday's post required over 25 minutes to complete! Too long to watch a hash join complete, I say!!) I'll leave determining why I used exactly 72,048 rows as an exercise for the reader. :-)

So... here's how CPU_ms varied with X_2 rowcount against an X_1 table with 72,048 rows.

And... here's a close-up of that very interesting area where there's a switch between the two linear functions.

So... some pics of the plans are coming.  And other stuff.  But I wanted to get this posted before I doze off for the night.

Here's a reminder.  Yeah, the scaling of the query is interesting - I want to do additional work in the future to see if this type of pattern shows up in hash joins of non-null values.  But... I still think in this case I am describing the scaling of SQL Server doing the wrong thing.

Because this query...

FROM       X_1 AS t1
INNER JOIN X_2 AS t2 ON t1.NOT_PK_X_1 = t2.NOT_PK_X_2
INNER JOIN X_2 AS t3 ON t2.PK_X_2     = t3.PK_X_2
... with 72,048 X_1 rows and 32,000 X_2 rows requires only 1 millisecond of CPU time.  Rather than 181641 ms of CPU time.  That's a pretty hefty performance return for making an implicit filter explicit.

No comments:

Post a Comment