Wednesday, September 24, 2014

Trace flag 4199 impact on Optimizer Memory Use; RESOURCE_SEMAPHORE_QUERY_COMPILE wait logjam

 Its an ugly query, but I have to show it some love.

This is one of the queries responsible for the RESOURCE_SEMAPHORE_QUERY_COMPILE wait logjam I noticed a few days ago.

On a whim, I asked the performance engineer to disable trace flag 4199.  The logjam disappeared - but system workload performance wasn't great.  I didn't expect it to be: to get the query concurrency desired with the available query memory and to get efficient plans at least the join re-ordering of trace flag 4101 is needed and all the other optimizer fixes rolled into trace flag 4199 probably also (in sum) make a positive contribution.

But with the logjam now associated with trace flag 4199, I've got to figure out why T4199 is resulting in more optimizer memory consumption.

I grabbed an estimated plan from the query with the following tacked on:

Then I cleared procedure cache and grabbed an estimated plan with the same trace flag list minus T4199.

Both query plans indicated early abort of the optimizer:


With T4199, the diagnostic trace flags 2372 & 2373 spit out 89,932 lines of text, finishing at 7628 for memory.
Without trace flag 4199, they spit out 119,790 lines of text, finishing at 3652 for memory.

Up to line 55,051 the two diagnostic captures are almost identical - the numbers stay within 1 of each other.  After line 55,051 the rules and the memory amounts diverge, as indicated by the graph above.

I'm gonna need a bigger boat :-)

Hopefully I can pique Paul White's interest, and maybe he can teach me an efficient way of picking some insight out of the 29,858 text lines after the T4199 version of the query diverges.  Ultimately I'd like to get the query not to time out of the optimizer... but getting it not to cause RESOURCE_SEMAPHORE_QUERY_COMPILE wait logjam is the first order of business.