Here's the deal. If a quickstats query is issued to amend the histogram for the first ordinal key of a filtered index... it doesn't include the filter in the query. Without the filter, the filtered index can't satisfy the quickstats query - even though the filtered index is the reason the query was issued. Unless there's another nonfiltered index with the same first ordinal key to satisfy the quickstats query, a full table scan will be needed to get its results.
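To make that concrete, here's a sketch using the column names from the query later in this post (the index definitions are my own illustration, not from the actual system):

```sql
-- Hypothetical filtered index on a large fact table
CREATE INDEX ix_Fact_Category_1_filtered
ON Fact (Category_1)
INCLUDE (Id_1)
WHERE Bit = 1 AND String_1 IS NOT NULL;

-- The quickstats query amending the histogram for Category_1 omits the
-- WHERE clause above, so the filtered index can't satisfy it.  A
-- nonfiltered index with the same first ordinal key can, avoiding the
-- full table scan:
CREATE INDEX ix_Fact_Category_1
ON Fact (Category_1);
```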
For a 600 million row table that can take a while. Here's a little salt in the wound. Multiple tasks in query compilation may load the filtered index stats - for example if the first key of the index is in a "col=X or col=Y or col=Z" situation. Each task that loads the filtered index stats will result in another quickstats query - the amended histogram isn't retained from one task to the next while the plan is compiling, even if there is no change to the underlying data in the interim. Takes 3 minutes to complete a quickstats query due to a full table scan? And there are 6 of them? That just added 18 minutes to the query compile time. Yikes.
And the next query against the same data and statistics runs into the same behavior.
Check out Michael J Swart's connect item, and the repro details behind the "Details" link in the connect item. Notice in his results... the quickstats query for the filtered index doesn't have the expected filter.
Trace flag 2390 can cause large compile time when dealing with filtered indexes. (Active)
https://connect.microsoft.com/SQLServer/feedback/details/2528743/
In this blog post, I detail my theory that SQL Server 2014 SP1 CU9/SP2 CU2 kb3189645 significantly changes behavior in these cases by exempting filtered indexes from quickstats queries.
SQL Server Trace flags 2389/2390/4139 ~ Quickstats Queries for Filtered Indexes?
http://sql-sasquatch.blogspot.com/2016/12/sql-server-trace-flags-238923904139.html
***Update 2016 Dec 9***
Trace flag 4139 is part of the stew, as is the legacy cardinality estimator.
Forcing the new cardinality estimator with QUERYTRACEON 2312 allowed fast compile.
So did global disable of trace flag 4139 - even though trace flags 2389 and 2390 remained enabled.
Brute force method - manual update of ALL stats on the table (which was unchanging during these tests) also allowed fast compile.
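For reference, the three mitigations above can be sketched like this (assuming trace flag 4139 was enabled globally on this instance; table and column names as in the query below):

```sql
-- 1. Force the new cardinality estimator for a single statement
SELECT TOP 1 Id_1
FROM Fact
WHERE Category_1 = '55'
OPTION (QUERYTRACEON 2312, RECOMPILE);

-- 2. Disable trace flag 4139 globally; 2389/2390 can remain enabled
DBCC TRACEOFF (4139, -1);
DBCC TRACESTATUS (2389, 2390, 4139);

-- 3. Brute force: manually update all stats on the (unchanging) table
UPDATE STATISTICS Fact;
```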
So... it looks like quickstats updates of these filtered include indexes due to trace flag 4139 are part of the problem. Maybe the quickstats updates are failing?
Not sure yet.
Additional work on this issue here:
http://sql-sasquatch.blogspot.com/2016/12/retrieve-recent-sql-server-stats-update.html
And here:
http://sql-sasquatch.blogspot.com/2016/12/sql-server-trace-flag-2388-shows.html
*****
2016 Nov 30
If this issue isn't reproducible again tomorrow I'll be upset. Hoping to finally get to the bottom of this. Let me know if you have ideas to investigate. My plan is to use trace flags 9292 and 9204 to find out which stats the optimizer is interested in, then create a new, empty copy of table Fact, transplanting the stats to the empty table. Hopefully compiling the query against the empty table will take just as long. If I can reproduce the problem in a completely clean context I'll have the most options for collaborative investigation.
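A sketch of that plan. The stats transplant relies on the undocumented STATS_STREAM option as scripted by SSMS ("Script Table As" with statistics and histograms); the stats name and the 0x... value are placeholders, one per stats object on the source table:

```sql
-- Which stats does the optimizer consider, and which does it load?
SELECT TOP 1 Id_1
FROM Fact
WHERE Category_1 = '55'
OPTION (QUERYTRACEON 3604, -- redirect trace flag output
        QUERYTRACEON 9292, -- stats objects considered interesting
        QUERYTRACEON 9204, -- stats objects actually loaded
        RECOMPILE);

-- Empty copy of the table, then transplant the scripted stats
SELECT TOP (0) * INTO Fact_empty FROM Fact;
UPDATE STATISTICS Fact_empty (stat_name_here) WITH STATS_STREAM = 0x...;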
A very simple validation query is being issued (below) - a filtered include index exists that satisfies all conditions with the first row returned.
And in fact execution time itself is very, very fast. But... compile time is ridiculous. Over 3 minutes. Output from trace flag 8675 is below. Time to compile was almost identical with and without trace flag 8757 added to the mix.
This system is SQL Server 2012 SP3. Because it's a production system with production data, no windbg. But if the issue is present in a query against an empty table with transferred stats, the next step will be trying to port the empty table, stats, and problem to a dev instance :-)
SELECT TOP 1 Id_1
FROM Fact
WHERE Category_1 = '55'
AND Id_1 IS NOT NULL
AND Bit = 1
AND String_1 IS NOT NULL
OPTION (QUERYTRACEON 3604, --redirect trace flag output
QUERYTRACEON 8675, --optimizer stages & timing
QUERYTRACEON 8757, --no early quit with trivial plan
RECOMPILE)
End of simplification, time: 0 net: 0 total: 0 net: 0
end exploration, tasks: 36 no total cost time: 198.649 net: 195.236 total: 198.65 net: 195.237
end search(1), cost: 0.0032832 tasks: 74 time: 0 net: 0 total: 198.65 net: 195.237
End of post optimization rewrite, time: 0 net: 0 total: 198.651 net: 195.237
End of query plan compilation, time: 0 net: 0 total: 198.651 net: 195.237
****
The elapsed time for the session was almost completely CPU time - not dominated by waits at all. There was some other activity on the server at the time of observation - but not much. Here are the spinlock deltas from measurements before and after the query was executed.
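The deltas were taken with a before/after snapshot along these lines (a sketch; the column list matches sys.dm_os_spinlock_stats):

```sql
-- Snapshot spinlock counters before the test
SELECT name, collisions, spins, sleep_time, backoffs
INTO #spins_before
FROM sys.dm_os_spinlock_stats;

-- ... run the problem query here ...

-- Diff the counters afterward
SELECT s.name,
       s.collisions - b.collisions AS collisions,
       s.spins      - b.spins      AS spins,
       s.sleep_time - b.sleep_time AS sleep_time,
       s.backoffs   - b.backoffs   AS backoffs
FROM sys.dm_os_spinlock_stats AS s
JOIN #spins_before AS b ON b.name = s.name
WHERE s.spins > b.spins
ORDER BY spins DESC;
```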
name | collisions | spins | sleep_time | backoffs |
SOS_BLOCKDESCRIPTORBUCKET | 365291 | 893170987 | 827998 | 223884 |
SOS_OBJECT_STORE | 117525 | 425010979 | 126664 | 90510 |
SOS_CACHESTORE_CLOCK | 787 | 138375000 | 2319 | 584 |
That doesn't tell me much... since just about the only web search hits for SOS_BLOCKDESCRIPTORBUCKET are my own tweets wondering what they might mean :-)
An idea occurred to me - maybe this resulted from an overly full plan cache hash bucket?
SELECT plans_in_bucket,count(*) as number_of_buckets
FROM
(
SELECT bucketid,count(*) as plans_in_bucket
FROM sys.dm_exec_cached_plans
GROUP BY bucketid
) inr
GROUP BY plans_in_bucket
ORDER BY plans_in_bucket DESC
plans_in_bucket | number_of_buckets
46 | 1
39 | 1
36 | 1
34 | 1
30 | 1
24 | 2
23 | 1
22 | 1
20 | 2
19 | 4
17 | 2
16 | 4
15 | 7
14 | 8
13 | 15
12 | 16
11 | 34
10 | 128
9 | 264
8 | 708
7 | 1499
6 | 3115
5 | 5158
4 | 7484
3 | 8631
2 | 7398
1 | 4242
Not likely, with the highest-count bucket at 46. Total cached plans: 140630.
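As a sanity check, the plan cache's own hash table metadata can also be queried directly (a sketch; no output claimed here):

```sql
-- Bucket counts and chain lengths for the SQL and object plan caches
SELECT name, type, buckets_count, buckets_in_use_count,
       buckets_avg_length, buckets_max_length
FROM sys.dm_os_memory_cache_hash_tables
WHERE type IN ('CACHESTORE_SQLCP', 'CACHESTORE_OBJCP');
```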