Monday, February 28, 2022

#SQLServer Trivial Plans for Inserts; Stats Expectations and Reality

OK.  SQL Server trivial plans for rowstore table INSERT. And related optimizer stats interaction.

TL;DR cached trivial plans for INSERT can be surprisingly stubborn. If a query matches to one, it won't perform or queue a stats update even if the stats are stale.  If the stats have been updated and would otherwise warrant a per-index plan - but there is a matching cached trivial plan for a per-row plan... outta luck. Might hafta DBCC FREEPROCCACHE or add OPTION(RECOMPILE) hint to make sure a cached trivial plan doesn't prevent a per-index update for an INSERT when you really want one.

~~~~ 

The blog post referenced below, first published 2015 March 16, is pretty good.

The relevant takeaway from that post is: in a database with AUTO_UPDATE_STATISTICS = ON, a statistics update will NOT invalidate a relevant cached query plan. A subsequent query which qualifies to use a cached plan will continue to use the cached trivial plan. This is Scenario 1 in the blog post. The following text is directly from the blog post.

When a plan is trivial, it’s unnecessary to recompile the query even statistics has been updated.  Optimizer generates trivial plan for very simple queries (usually referencing a single table).  In XML plan, you will see statementOptmLevel="TRIVIAL". In such case, it’s futile and you won't get a better or different plan.

The demo code in the blog post uses SELECT queries only. In my examples in my own blog post, I'm concerned with INSERT queries.  Are INSERT queries eligible for both trivial plans and plans that have been more fully optimized? If INSERT queries can use a trivial plan, is it possible for a cached trivial plan to remain in use when a better and/or different plan could be selected?

Perhaps the reader is familiar with the Paul White blog post referenced below, from 2013 January 26. In that blog post, Paul White discusses wide, per-index INSERT query plans and narrow, per-row INSERT query plans.  How do these plan types relate to trivial plans?

OK. Enough jibber-jabber. Time to do something. Like create a stored procedure.  Which will be used in a database with AUTO_UPDATE_STATISTICS = ON and AUTO_UPDATE_STATISTICS_ASYNC = OFF.

This stored procedure will drop two tables if they already exist, then create those two tables.  The source table has three columns and a single index - the clustered primary key.  The target table also has three columns and a clustered primary key. In addition the target table has two non-clustered indexes.

The @insert_no_rows parameter controls whether or not there will be an INSERT query from the empty source table to the empty target table. That wouldn't move any data - but maybe it will cache a plan?

The INSERT query is dynamic SQL - both in the stored procedure and when I issue it adhoc later.  It isn't necessary for this query to be dynamic SQL - it was just easier that way to make sure the batch SQL text matched and plan re-use happened when it was a possibility. While writing this blog post at one point the query SQL text in the stored procedure has a different number of tabs preceding the formatted T-SQL than appeared in my adhoc query afterward and it took me forever to figure out why plan re-use wasn't occurring. :-) 

CREATE OR ALTER PROCEDURE test__trivial_plan_inserts @insert_no_rows INT = 0, @x INT = 0
AS
BEGIN
	DROP TABLE IF EXISTS trivial_test_source;
	DROP TABLE IF EXISTS trivial_test_target;

	CREATE TABLE trivial_test_source
	(	col1 INT CONSTRAINT pk__trivial_test_source PRIMARY KEY CLUSTERED
	,	col2 INT
	,	col3 INT
	);

	CREATE TABLE trivial_test_target
	(	col1 INT CONSTRAINT pk__trivial_test_target PRIMARY KEY CLUSTERED
	,	col2 INT INDEX nci__trivial_test_target__col2
	,	col3 INT INDEX nci__trivial_test_target__col3
	);

	-- white space in the INSERT dynamic SQL must match exactly
	-- in stored procedure and adhoc EXEC for cached plan reuse
	IF @insert_no_rows = 1
	BEGIN
		DECLARE @sqlT NVARCHAR(1000) = N'INSERT INTO trivial_test_target SELECT * FROM trivial_test_source;'
		EXEC (@sqlT);
	END
	-- white space in the INSERT dynamic SQL must match exactly
	-- in stored procedure and adhoc EXEC for cached plan reuse

	;WITH	n0 AS (	SELECT TOP (32) n = 1 
					FROM master.dbo.spt_values)
	,		n1 AS (	SELECT n = ROW_NUMBER() OVER (ORDER BY(SELECT NULL))
					FROM n0 t1 CROSS JOIN n0 t2)
	INSERT INTO	trivial_test_source
	SELECT		TOP (@x) n, n, n
	FROM		n1;
END

For the first experiment, let's not worry about caching plans. Let's call the stored procedure like this:

EXEC test__trivial_plan_inserts @insert_no_rows = 0, @x = 1023;

So, give me 1023 rows in table trivial_test_source.

After the stored procedure executes, let's grab an actual plan from this simple INSERT query:

DECLARE @sqlT NVARCHAR(100) = N'INSERT INTO trivial_test_target SELECT * FROM trivial_test_source;'
EXEC (@sqlT);
-- white space in the INSERT dynamic SQL must match exactly
-- in stored procedure and adhoc EXEC for cached plan reuse

And here's the actual plan.  That's a narrow, per-row plan. Note the estimate of 1023 rows; dead-on.



The Object properties in the graphical plan specify the target table indexes which will be updated for each row inserted into the clustered index. So far, so good.














































If we look at the plan XML, we can see this is a trivial plan.


OK. Now let's call the stored procedure like this:
EXEC test__trivial_plan_inserts @insert_no_rows = 0, @x = 1024;


That will drop the source and target tables, recreate the tables, and populate the source table with 1024 rows instead of the previous 1023.

Now get an actual plan for the insert again:

DECLARE @sqlT NVARCHAR(100) = N'INSERT INTO trivial_test_target SELECT * FROM trivial_test_source;'
EXEC (@sqlT);
-- white space in the INSERT dynamic SQL must match exactly
-- in stored procedure and adhoc EXEC for cached plan reuse

And... bam!!! Yeah, the graphical plan below certainly looks like a different plan. This is what Paul White refers to as a wide, per-index update.

First, at the right of the blue box below, the source table scan operator feeds rows into the clustered index insert operator. No sort is needed to optimize the insert because the source and target tables have the same primary key clustered index definition. These rows are also fed into a table spool at the left of the blue box below.  The table spool at the left of the blue box is the same table spool at the left of the gold box - just different zones of the same plan.


Table spool populated, it is used as a source for a sort, then insert into a non-clustered index.  This happens once in the upper blue box, then again in the lower gold box.


Demonstrating the benefit of a per-index update over a per-row update for an INSERT of many thousand rows is left as an exercise for the reader.

While I won't investigate at this time, I do want to make note of an interesting switcheroo that takes place. Recall that within the CREATE TABLE, the clustered primary key and the two non-clustered indexes on col2 and col3 were created with inline syntax.







How did the non-clustered index on col3 become index_id 2, while the nonclustered index on col2 became index_id 3?  I don't know - and i don't know of anywhere that could be consequential. Yet.














How did I notice that little switcheroo?  Well, every per-index INSERT graphical plan I've seen has shown indexes in index_id order from top-down.  And when I've observed large per-index updates with sys.dm_tran_locks, the acquisition of locks over time by the session indicates indexes are tended to in index_id order.  I won't at this time be investigating further whether per-index updates always handle indexes in index_id order, or how the little index_id switcheroo occurred. Just an interesting bread crumb.

Before we do the next part I want to make sure the database is set up like I think...
Excellent, exactly what I wanted.










So. This time we drop the tables and recreate them. We issue the INSERT INTO... SELECT while both source and target tables are empty. Then populate the source table with 1024 rows.

EXEC test__trivial_plan_inserts @insert_no_rows = 1, @x = 1024;

And now we are ready to grab an actual plan...

DECLARE @sqlT NVARCHAR(100) = N'INSERT INTO trivial_test_target SELECT * FROM trivial_test_source;'
EXEC (@sqlT);
-- white space in the INSERT dynamic SQL must match exactly
-- in stored procedure and adhoc EXEC for cached plan reuse

So now we have 1024 rows - that number of rows previously got a per-index update plan.  But this time it got a per-row plan. Huh.









Well, what the heck.  If we look at the plan XML we see a clue. RetrievedFromCache.



Now wait just a minute. That source table went from 0 rows to 1024 rows; did the stats get updated?

Wow. I guess not. When the insert matched a cached trivial plan, the stats did not get updated even though they were stale and auto update stats is true in this database.


So - what if I try this again? Drop the tables, recreate the tables, run the insert with zero rows to put a plan in the cache. Put 1024 rows in the source table. Then... I'll explicitly update stats.

EXEC test__trivial_plan_inserts @insert_no_rows = 1, @x = 1024;
UPDATE STATISTICS trivial_test_source;

OK, stats are updated.




So what happens now? Ouch. It doesn't matter.  If the stats are stale and there's a matching cached trivial plan - the stats don't get updated.  If the stale stats ARE updated before the query is executed - the matching trivial plan isn't invalidated. So that matching cached trivial plan will still be used. Even though I don't want it to be used. I just want SQL Server to take a fresh look at what's going on.

And that fresh look can solve the problem.  Clearing the plan cache with DBCC FREEPROCCACHE (or ALTER DATABASE SCOPED CONFIGURATION CLEAR PROCEDURE CACHE in the relevant database) would cause SQL Server to take a fresh look, and decide if the INSERT should get a per-index plan instead of a per-row plan*.  Tacking OPTION(RECOMPILE) hint on to the end of that INSERT would also work in a much more targeted manner. Sometimes, in order to prevent a cached trivial plan from forcing a per-row update when we really want a per-index update, gonna hafta encourage SQL Server to take a fresh look in one of those ways.

*There are a number of SQL Server instance-wide configuration settings changes to which will also clear plan cache, such as maxdop and cost threshold for parallelism. But i don't recommend changing any of them solely to clear plan cache. Microsoft lists the configuration options which clear plan cache on the following BOL page.