Tuesday, January 31, 2023

#SQLServer - USE HINT ('DISABLE_OPTIMIZER_ROWGOAL') - SURPRISE! TOP operator in graphical query plan

 Hello!!!

 Sometimes perhaps the things I care about will really leave people wondering. :-)

In this case, it's a SQL Server graphical query plan oddity.

I don't think this behavior is necessarily related to a performance problem.

But I think this behavior can make it *harder* to troubleshoot SQL Server graphical query plans for performance.  This is a case where a TOP operator shows up in a graphical plan unexpectedly - and seemingly against the expressed intent of the USE HINT ('DISABLE_OPTIMIZER_ROWGOAL') hint.

Here are some sources for additional background about the general idea.

KB2667211 - A query may take a long time to run if the query optimizer uses the Top operator in SQL Server 2008 R2 or in SQL Server 2012
https://support.microsoft.com/en-us/topic/kb2667211-a-query-may-take-a-long-time-to-run-if-the-query-optimizer-uses-the-top-operator-in-sql-server-2008-r2-or-in-sql-server-2012-780ec760-894a-3aac-0d61-e954ea05633f

KB4051361 - Optimizer row goal information in query execution plan added in SQL Server 2014, 2016 and 2017
https://support.microsoft.com/en-us/topic/kb4051361-optimizer-row-goal-information-in-query-execution-plan-added-in-sql-server-2014-2016-and-2017-ec130d21-1adc-3d3d-95a5-cdb075722269

Inside the Optimizer: Row Goals In Depth
Paul White
@sql_kiwi
2010 August 18
https://www.sql.kiwi/2010/08/inside-the-optimiser-row-goals-in-depth.html

All right. Game on.

A little bit of info about the system.


Let's do some setup (I'll append all of the T-SQL at the end of the blog post for those that want to follow along at home).

 


OK, let's see what our folly hath wrought. Good, good. Let the A9 flow...

I know this query seems almost non-sensical.  But I assure you it was lovingly distilled from a much more complex case I ran into in the field.  Sure, the query could be simplified.  But that's not the point of this exercise :-) Just grabbing the estimated plan here, because that's all I care about in this repro.

(If t1 and t0 were actually different heaps this would be about as good SQL Server could do without any indexes.)


OK, it's a funky query... but nothing too unusual yet in the graphical plan.

So we've got a query that probably no *person* would write and the graphical plan isn't really too objectionable.  Why don't we add a hint that no *person* would add to this query, for good measure?  One that would be expected to have no effect? If there isn't an apparent row goal already, what change could come about if we added the DISABLE_OPTIMIZER_ROWGOAL hint?

Oh. That's weird.  A TOP operator magically appeared in the plan.

The attribute EstimateRowsWithoutRowGoal (discussed in the second KB article linked above) doesn't appear in the plan XML.  So... I guess this is "row goal lite" or something?

 

Martin Smith recommended testing the a hint to disable a rule: QUERYRULEOFF GbAggToConstScanOrTop.

The surprise TOP operator still shows up in the estimated plan.

 


And... I didn't realize this until after I initially published this blog post.

The sum of the costs of the operators in the graphical plans with the surprise TOP operator... is 106%.

 

~~~

Here's the T-SQL for those that want to have their own fun :-)


CREATE TABLE #temp (col1 NVARCHAR(300) NOT NULL);

;WITH 
  n1 AS (SELECT n = 1 UNION ALL SELECT 1)
, n2 AS (SELECT n = 1 FROM n1 a1 CROSS JOIN n1 a2)
, n3 AS (SELECT n = 1 FROM n2 a1 CROSS JOIN n2 a2)
, n4 AS (SELECT n = 1 FROM n3 a1 CROSS JOIN n3 a2)
, n5 AS (SELECT n = 1 FROM n4 a1 CROSS JOIN n4 a2)
, n6 AS (SELECT n = 1 FROM n5 a1 CROSS JOIN n3 a2)
, n7 AS (SELECT n = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM n6)
INSERT INTO #temp
SELECT CASE WHEN n%11=0 THEN 'A1'
            WHEN n%10=0 THEN 'A2'
            WHEN n%9=0 THEN 'A3'
            WHEN n%8=0 THEN 'A4'
            WHEN n%7=0 THEN 'A5'
            WHEN n%6=0 THEN 'A6'
            WHEN n%5=0 THEN 'A7'
            WHEN n%4=0 THEN 'A8'
            ELSE 'A9'
       END
FROM n7;

SELECT   col1, how_many = count(*) 
FROM     #temp 
GROUP BY col1
ORDER BY col1 ASC;

SELECT  col1
FROM    #temp t0
WHERE   EXISTS (    SELECT 1
					FROM   #temp t1
					WHERE  t1.col1 = 'A9'
					AND    t0.col1 = t1.col1 )
OPTION(RECOMPILE, USE HINT('FORCE_DEFAULT_CARDINALITY_ESTIMATION');

SELECT  col1
FROM    #temp t0
WHERE   EXISTS (    SELECT 1
					FROM   #temp t1
					WHERE  t1.col1 = 'A9'
					AND    t0.col1 = t1.col1 )
OPTION(RECOMPILE, USE HINT('FORCE_DEFAULT_CARDINALITY_ESTIMATION', 'DISABLE_OPTIMIZER_ROWGOAL'));





1 comment:

  1. I consider myself an expert at reading execution plans, not at how they are created. The true source of knowledge about the optimizer would be Paul White.
    Nevertheless, I do in this case have a theory.

    The EXISTS in the query is implemented in the execution plan with a Nested Loops operator doing a Left Semi Join operation. A semi join (or anti semi join) on Nested Loops means that it will always stop processing the inner (lower) input after finding the first matching row, and immediately continue with the next outer (upper) row. So this means that the lower input has an implied rowgoal of 1. And this rowgoal is taken into account when costing the lower input. Instead of expecting to read all rows from #t1 on each execution, it uses some formulas to estimate how many rows it will return on the average execution before the left semi join finds a match and stops asking for more rows.

    Then you disable the rowgoal. So now, the cardinality estimation and the costing logic are based on how much it would cost to let the Table Scan on #t1 read and return all rows in the table instead of stopping after the first match. That by itself would cause a huge increase in the estimated cost. But the optimizer still tries to get the cost down. It realizes that now, without the rowgoal, the Table Scan does a lot of useless work. (This is a wrong assumption, Nested Loops will still stop asking for more rows after the first match ... but your hint forced the optimizer to work on that wrong assumption, so it does what you tell it to do). Realizing that Table Scan does a lot of work that is not needed, the optimizer then tries to find a way to reduce that work. It knows only the first matching row is actually needed. And so it injects a Top operator to ensure that the Table Scan is no longer called once a matching row has been found.

    The estimated plan cost of the second execution plan is very slightly higher (0.11%) than that of the unhinted plan. The Top operator is a very cheap operator, but nothing is entirely free. That little bit of extra work, for one row of each execution, multiplied by almost 400,000 executions, causes a minute increase in estimated cost. So I guess this (once more) proves that it is almost always better to let the optimizer work without interfering with hints.

    ReplyDelete