Tuesday, January 31, 2023

#SQLServer - USE HINT ('DISABLE_OPTIMIZER_ROWGOAL') - SURPRISE! TOP operator in graphical query plan

 Hello!!!

 Sometimes perhaps the things I care about will really leave people wondering. :-)

In this case, it's a SQL Server graphical query plan oddity.

I don't think this behavior is necessarily related to a performance problem.

But I think this behavior can make it *harder* to troubleshoot SQL Server graphical query plans for performance.  This is a case where a TOP operator shows up in a graphical plan unexpectedly - and seemingly against the expressed intent of the USE HINT ('DISABLE_OPTIMIZER_ROWGOAL') hint.

Here are some sources for additional background about the general idea.

KB2667211 - A query may take a long time to run if the query optimizer uses the Top operator in SQL Server 2008 R2 or in SQL Server 2012
https://support.microsoft.com/en-us/topic/kb2667211-a-query-may-take-a-long-time-to-run-if-the-query-optimizer-uses-the-top-operator-in-sql-server-2008-r2-or-in-sql-server-2012-780ec760-894a-3aac-0d61-e954ea05633f

KB4051361 - Optimizer row goal information in query execution plan added in SQL Server 2014, 2016 and 2017
https://support.microsoft.com/en-us/topic/kb4051361-optimizer-row-goal-information-in-query-execution-plan-added-in-sql-server-2014-2016-and-2017-ec130d21-1adc-3d3d-95a5-cdb075722269

Inside the Optimizer: Row Goals In Depth
Paul White
@sql_kiwi
2010 August 18
https://www.sql.kiwi/2010/08/inside-the-optimiser-row-goals-in-depth.html

All right. Game on.

A little bit of info about the system.


Let's do some setup (I'll append all of the T-SQL at the end of the blog post for those that want to follow along at home).

 


OK, let's see what our folly hath wrought. Good, good. Let the A9 flow...

I know this query seems almost non-sensical.  But I assure you it was lovingly distilled from a much more complex case I ran into in the field.  Sure, the query could be simplified.  But that's not the point of this exercise :-) Just grabbing the estimated plan here, because that's all I care about in this repro.

(If t1 and t0 were actually different heaps this would be about as good SQL Server could do without any indexes.)


OK, it's a funky query... but nothing too unusual yet in the graphical plan.

So we've got a query that probably no *person* would write and the graphical plan isn't really too objectionable.  Why don't we add a hint that no *person* would add to this query, for good measure?  One that would be expected to have no effect? If there isn't an apparent row goal already, what change could come about if we added the DISABLE_OPTIMIZER_ROWGOAL hint?

Oh. That's weird.  A TOP operator magically appeared in the plan.

The attribute EstimateRowsWithoutRowGoal (discussed in the second KB article linked above) doesn't appear in the plan XML.  So... I guess this is "row goal lite" or something?

 

Martin Smith recommended testing the a hint to disable a rule: QUERYRULEOFF GbAggToConstScanOrTop.

The surprise TOP operator still shows up in the estimated plan.

 


And... I didn't realize this until after I initially published this blog post.

The sum of the costs of the operators in the graphical plans with the surprise TOP operator... is 106%.

 

~~~

Here's the T-SQL for those that want to have their own fun :-)


CREATE TABLE #temp (col1 NVARCHAR(300) NOT NULL);

;WITH 
  n1 AS (SELECT n = 1 UNION ALL SELECT 1)
, n2 AS (SELECT n = 1 FROM n1 a1 CROSS JOIN n1 a2)
, n3 AS (SELECT n = 1 FROM n2 a1 CROSS JOIN n2 a2)
, n4 AS (SELECT n = 1 FROM n3 a1 CROSS JOIN n3 a2)
, n5 AS (SELECT n = 1 FROM n4 a1 CROSS JOIN n4 a2)
, n6 AS (SELECT n = 1 FROM n5 a1 CROSS JOIN n3 a2)
, n7 AS (SELECT n = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM n6)
INSERT INTO #temp
SELECT CASE WHEN n%11=0 THEN 'A1'
            WHEN n%10=0 THEN 'A2'
            WHEN n%9=0 THEN 'A3'
            WHEN n%8=0 THEN 'A4'
            WHEN n%7=0 THEN 'A5'
            WHEN n%6=0 THEN 'A6'
            WHEN n%5=0 THEN 'A7'
            WHEN n%4=0 THEN 'A8'
            ELSE 'A9'
       END
FROM n7;

SELECT   col1, how_many = count(*) 
FROM     #temp 
GROUP BY col1
ORDER BY col1 ASC;

SELECT  col1
FROM    #temp t0
WHERE   EXISTS (    SELECT 1
					FROM   #temp t1
					WHERE  t1.col1 = 'A9'
					AND    t0.col1 = t1.col1 )
OPTION(RECOMPILE, USE HINT('FORCE_DEFAULT_CARDINALITY_ESTIMATION');

SELECT  col1
FROM    #temp t0
WHERE   EXISTS (    SELECT 1
					FROM   #temp t1
					WHERE  t1.col1 = 'A9'
					AND    t0.col1 = t1.col1 )
OPTION(RECOMPILE, USE HINT('FORCE_DEFAULT_CARDINALITY_ESTIMATION', 'DISABLE_OPTIMIZER_ROWGOAL'));