Friday, January 3, 2014

Trace Flag 4199: Complex Risk Assessment Part I

**Update**
I may have missed some rolled up trace flags, or some flagless fixes activated by T4199.  Finally, I may find some information about some 'undocumented' fixes included in T4199.   I've added references for those that I've found so far to a followup second post on this topic.

Trace Flag 4199: Complex Risk Assessment: Part II

**End update**

Recently a question came up on Twitter #sqlhelp about kb2904010, a recent fix for sample size in sample stats updates (kb article is linked below).  The original question was about determining the applicability of that fix to a given system.  For a given query, evaluating the appropriateness isn't too bad... but for a change like that you really should evaluate across the system.  And it seemed to me that the following procedure would be appropriate:
1. update all stats on the system with their typical sampling rate
2. save the stats info
3. apply and activate the fix
4. update all stats again with their typical sampling rate
5. compare stats with and without the fix

Here's the reason: if you only evaluate a single query, the effects of the fix on other queries are unknown.  In fact, that's true for any defined set of queries.  Even if you evaluate all current queries, what about the query tomorrow that is executed for the first time against existing data/indexes/stats keys?  So, in this case, comparing the effects across all current stats keys with their current sampling rates, while not necessary  gives information for risk assessment that wouldn't be available if the effect on only a subset of queries is evaluated.

I didn't say all of that on Twitter :) I'm long-winded in real life, but I try to behave when limited to 140.

There was a much shorter answer which took that particular exchange where it needed to go: the risk assessment of the fix in 2904010 is vastly increased by the fact that it is activated by trace flag 4199 (and does not have a separate trace flag which enables the fix individually).  You see, if that organization wasn't already using trace flag 4199, a two state test would not really be sufficient for risk assessment.  They'd have to have a 3 state test: baseline, baseline + T4199, baseline + T4199 + kb2904010. 

Why?  Cuz T4199 pulls a lotta stuff along with it, that's why.  In the days of my youth this is the type of roll-up fix that I would set out testing combinatorically - guessing which fixes would matter most to the system and trying to implement them singly.  Results with single fixes could be compared to baseline and with the full rollup.  Single fixes that showed strong positive returns in testing could be combined, with results compared to baseline and full rollup.

Wanna try?  I did.  But because there are 27 separate trace flags all rolled into trace flag 4199, I got tired out after running 1 test query through all possibilities.  Links below to the 28 kb hot fixes for these trace flags (one trace flag has a separate kba for SQL Server 2005 and SQL Server 2008).

Then... well, then come the 6 fixes which are activated by trace flag 4199 which CANNOT be activated separately.  You want 'em?  Gotta take the whole kaboodle.

That's a pretty complex risk assessment.  Now I will say this: there are considerable gains possible with trace flag 4199.  Just based on the activity of trace flag 4101 I saw the memory grant of some queries shrink from over 10GB to less than 2 mb!  And none of my test queries turned out with plans that were significantly worse on the test data set.  I haven't seen examples of chaos caused by trace flag 4199, and I know that Adam Machanic uses it widely (though he does of course encourage caution in adopting it).  So there are considerable benefits possible... I just wanted to try to put some documentation out there that would be useful for risk assessment.


In the tables below, general kba that mention T4199 appear first.  Then, sorted by trace flag number, the trace flag kbs for trace flags rolled up into T4199.  The MSSQL Version noted there is the earliest major version with the kb hotfix (not necessarily the version with the earliest calendar date availability of the hotfix).

Finally, sorted by kb article number, the 6 hotfixes I found that are activated by T4199 and CANNOT be activated individually.

Enjoy!  Hyperlinks behind each KBa number.


MS KBa
Description


Trace flags rolled into T4199


Documents T4199 w/querytraceon


MS KBa
MSSQL Version Desc
Trace flag
Description
SQL Server 2000
4101
outer joins w/filter criteria
SQL Server 2005
4102
semi joins; mentions T4118??
SQL Server 2005
4102
optimizer timeout; mentions T4122 in err?
SQL Server 2005
4104
multi-column join cardinality
SQL Server 2005
4105
index union cost
SQL Server 2005
4106
multi-join query compile time
SQL Server 2005
4107
range filter w/multiple table partitions
SQL Server 2005
4108
fast forward-only cursor
SQL Server 2005
4109
error w/fast forward-only cursor
SQL Server 2005
4110
query w/user-defined scalar function
SQL Server 2005
4111
replication merge agent
SQL Server 2005
4115
forward-only cursor
SQL Server 2005
4116
join of local and remote table
SQL Server 2005
4117
blocking issue for row updates
SQL Server 2005
4119
LIKE and comparison operator
SQL Server 2005
4120
cursor deadlock
SQL Server 2005
4121
outer joins; requires T4101


4122 mentioned in 946020 in err?
SQL Server 2005
4124
BIGINT statistics
SQL Server 2005
4125
inner join of derived table w/DISTINCT
SQL Server 2005
4126
transactional replication
SQL Server 2005
4127
long compile time
SQL Server 2005
4128
cursor for updates
SQL Server 2008
4128
cursor for updates
SQL Server 2005
4129
GETDATE() in table copies
SQL Server 2005
4131
ON clause of query JOIN w/LIKE
SQL Server 2005
4133
query notifications and error log
SQL Server 2008
4135
error 605/804 on temp table insert
















MS KBa
MSSQL Version Desc
Trace flag
Description
SQL Server 2008
NULL
parameterized CTE query
SQL Server 2008
NULL
CHANGETABLE function
SQL Server 2005
NULL
multiple self-joins on same column
SQL Server 2008
NULL
sp_cursoropen
SQL Server 2012
NULL
linked server w/different collation
SQL Server 2008 R2
NULL
stats update sample size

1 comment: