Thursday, May 2, 2013

O Registry, Registry! But wherefore did you so much tempt the heavens?

O Registry, Registry!  But wherefore did you so much tempt the heavens?
Whither WinsockListenBacklog?
Whither EnableDynamicBacklog?
Whither DynamicBacklogGrowthDelta, MaximumDynamicBacklog?
Whither MinimumDynamicBacklog away so fast?

Apologies from the Sasquatch to the Bard.  Its the coffee talkin'.

Under stress of concurrent TCP connection attempts, various errors can be reported by SQL Server as connections are refused.  Below is a good description of one of the connection failure modes.  Login authentication failures (regardless of credentials supplied) can also be reported under concurrent connection stress.
SQL Server logs 17832 with multiple TCP\IP connection requests
http://support.microsoft.com/kb/154628

The following support document is a classic. Contains of the best descriptions of SQL Server Windows registry setting WinsockListenBacklog.

Description of TCP/IP settings that you may have to adjust when SQL Server connection pooling is disabled
KB328476 - Description of TCP/IP settings that you may have to adjust when SQL Server connection pooling is disabled (microsoft.com) 


Check the relevant products in the "applies to section".

SQL Server 2005 editions are listed.  Does that mean that registry setting WinsockListenBacklog should be created for SQL Server 2005 and later?  Could that resolve some of the TCP connection failures under stress for SQL Server 2008 or 2008 R2?

Nope.

The support article "applies" to SQL Server 2005 because it provides the following information:
"Starting in SQL Server 2005, the network library passes a value of SOMAXCONN as the backlog setting to the listen API. SOMAXCONN allows the Winsock provider to set a maximum reasonable value for this setting. Therefore, the WinsockListenBacklog registry key is no longer used or needed in SQL Server 2005."

So, no more WinsockListenBacklog for SQL Server 2005 and beyond.

The clever reader will remember the dynamicbacklog family of registry values.  Full description in this support article:
Internet server unavailable because of malicious SYN attacks
http://support.microsoft.com/kb/142641
[update 2020 December 21 - it's a dead link now and I don't have a replacement yet :-( ]

So... should those be set for SQL Server 2005 and beyond?  Not if the OS is Windows Server 2008 or beyond.  They won't have any effect.  This is explained here.
Where have those AFD driver related registry (DynamicBacklogGrowthDelta / EnableDynamicBacklog ...) keys gone?
http://blogs.technet.com/b/nettracer/archive/2010/08/11/where-have-those-afd-driver-related-registry-dynamicbackloggrowthdelta-enabledynamicbacklog-maximumdynamicbacklog-minimumdynamicbacklog-keys-gone.aspx
[update 2020 December 21 - it's a dead link now and I don't have a replacement yet :-( ]


OK.  So... if there are connection failures under concurrent TCP connection stress with SQL Server 2008 or later, on Windows 2008 or later... whatfor to do?  Whither away the salve to soothe?

I'll have to pick that up later :)  As far as I can tell right now, the correctives on the most recent versions of SQL Server and Windows Server OS have to be about servicing the connection requests faster (clear the way for the nonpaged memory pool, make sure there are enough TCP connection memory blocks, take care of any other memory pressure conditions, consider binding the NIC to cores, utilize RSS scaling and potentially increase the number of cores servicing the NIC, etc).

More details as I learn, especially if I find specific correctives or diagnostics... wanted to get this out there so folks don't waste time with registry changes once valued but now merely "lodged with me useless".  My apologies to Milton.  

2 comments:

  1. Just out of curiosity, have you considered using procmon to see if later versions of SqlServer.exe read the registry keys registered mentioned in the older KB articles?

    ReplyDelete
  2. Nope. Would be a great step if my current concern was connection rejection :) As it happens, I see a small number of unexpected connection rejection and authentication failures occasionally. I'll probably loop back in a few months to try to stamp them out. This post is a sidetrack from researching FCoE failure/diagnosis/correction routes for sustained high disk IO SQL Server workflows. Spent time today reading through 10gbE CNA driver and firmware release notes... looking for TCP timer errors, false-checksum-error packet drops, diagnostics for queue full events... and that's where I sidetracked to Windows registry backlog settings.

    ReplyDelete