I'll stress that I use the -P<n> startup option for brief, targeted experimentation and observation only, on my laptop or in test instances. I've never run a perf/scale workload test against an instance with scheduler count modified this way, and I don't plan to do so. I'll plead with you instead to never let this startup option get anywhere near a production SQL Server instance, or an instance with workloads of any import. My stance and strategy is currently that its better to limit the "instruction entrance liability" of any one physical core and instead increase instruction intensity by tuning the workload and queries.
*****
The undocumented SQL Server -P<n> startup option (/P<n> if used in NET START command) instructs SQL Server to create <n> schedulers. Until I tried it and started poking around, I thought the SQLOS schedulers would just be stacked on top of otherwise-existing SQLOS cpu, node, memory node, and processor group structures. I expected that on my laptop I'd be restricted to a maximum of 64 schedulers since at the Windows OS level my laptop has a single processor group. My main goals with these tests were to probe autosoftNUMA behavior at different scheduler counts-per-node and transaction log writer assignment at different node counts.
I learned a few things along the way. 😂😂😂
First, I created two tables in the [master] database.
USE MASTER; GO CREATE TABLE system_details (sqlserver_start_time DATETIME, scheduler_count INT, cpu_count INT, socket_count INT, numa_node_count INT, softnuma_configuration_desc NVARCHAR(50), lgwr_count INT); GO CREATE TABLE node_details (sqlserver_start_time DATETIME, node_id INT, node_state_desc NVARCHAR(50), memory_node_id INT, processor_group INT, online_scheduler_count INT); GO
Next, a stored procedure in the [master] database.
USE MASTER; GO CREATE OR ALTER PROCEDURE startup_sys_details AS ;WITH txwr AS (SELECT lgwr_count = COUNT(*)
FROM sys.dm_exec_requests req
WHERE req.command = 'LOG WRITER') INSERT INTO system_details SELECT osi.sqlserver_start_time, osi.scheduler_count, osi.cpu_count,
osi.socket_count, osi.numa_node_count, osi.softnuma_configuration_desc,
txwr.lgwr_count FROM sys.dm_os_sys_info osi CROSS JOIN txwr; INSERT INTO node_details SELECT osi.sqlserver_start_time, osn.node_id, osn.node_state_desc,
osn.memory_node_id, osn.processor_group, osn.online_scheduler_count FROM sys.dm_os_nodes osn CROSS JOIN sys.dm_os_sys_info osi WHERE osn.node_id < 64; GO
Then, let's assign that stored procedure as an auto-run startup stored procedure.
USE MASTER; GO EXEC sp_procoption 'startup_sys_details', 'STARTUP', 'ON'; GO
Add a .bat script with NET STOP and NET START commands for the SQL Server service, plus a sqlcmd command to turn the auto-start stored procedure back off after the desired tests, and all the pieces are available.
NET STOP MSSQL$CTP_3_0__2019 NET START MSSQL$CTP_3_0__2019 NET STOP MSSQL$CTP_3_0__2019 NET START MSSQL$CTP_3_0__2019 /P8 NET STOP MSSQL$CTP_3_0__2019 NET START MSSQL$CTP_3_0__2019 /P16 NET STOP MSSQL$CTP_3_0__2019 NET START MSSQL$CTP_3_0__2019 /P24 NET STOP MSSQL$CTP_3_0__2019 NET START MSSQL$CTP_3_0__2019 /P28 NET STOP MSSQL$CTP_3_0__2019 NET START MSSQL$CTP_3_0__2019 /P32 NET STOP MSSQL$CTP_3_0__2019 NET START MSSQL$CTP_3_0__2019 /P36 NET STOP MSSQL$CTP_3_0__2019 NET START MSSQL$CTP_3_0__2019 /P40 NET STOP MSSQL$CTP_3_0__2019 NET START MSSQL$CTP_3_0__2019 /P44 NET STOP MSSQL$CTP_3_0__2019 NET START MSSQL$CTP_3_0__2019 /P48 NET STOP MSSQL$CTP_3_0__2019 NET START MSSQL$CTP_3_0__2019 /P56 NET STOP MSSQL$CTP_3_0__2019 NET START MSSQL$CTP_3_0__2019 /P62 NET STOP MSSQL$CTP_3_0__2019 NET START MSSQL$CTP_3_0__2019 /P64 NET STOP MSSQL$CTP_3_0__2019 NET START MSSQL$CTP_3_0__2019 /P72 NET STOP MSSQL$CTP_3_0__2019 NET START MSSQL$CTP_3_0__2019 /P80 sqlcmd -S localhost\CTP_3_0__2019 -d master -Q "EXEC sp_procoption 'startup_sys_details', 'STARTUP', 'OFF';" NET STOP MSSQL$CTP_3_0__2019 NET START MSSQL$CTP_3_0__2019
So, execute my batch script. Then let's see the results.
SELECT start_number = ROW_NUMBER() OVER (ORDER BY sd.sqlserver_start_time), sd.* FROM system_details sd ORDER BY sd.sqlserver_start_time;
I've highlighted some minor surprises above in red. First, at 16 schedulers I expected auto-softNUMA to be active, but it wasn't. The other two red boxes highlight scenarios with fewer than the 8 transaction log writers I expected. I don't plan to investigate these percularities at this time.
;WITH grp AS (SELECT start_number = ROW_NUMBER() OVER (ORDER BY nd.sqlserver_start_time), nd.sqlserver_start_time FROM node_details nd GROUP BY nd.sqlserver_start_time) SELECT grp.start_number, grp.sqlserver_start_time, nd.processor_group, nd.node_state_desc,
node_count = COUNT(DISTINCT nd.node_id), nd.online_scheduler_count FROM node_details nd JOIN grp ON nd.sqlserver_start_time = grp.sqlserver_start_time GROUP BY grp.start_number, grp.sqlserver_start_time,
nd.processor_group, nd.node_state_desc, nd.online_scheduler_count ORDER BY grp.start_number, grp.sqlserver_start_time,
nd.processor_group, nd.node_state_desc;
Not as much surprise in the node details in this testing. It may be surprising that in order to accomodate the request for more than 64 schedulers, SQLOS created another processor group structure to hold the schedulers even though the OS still has a single processor group. Perhaps its also surprising that while the Windows OS now tries to balance logical CPU count in processor groups, SQLOS is content to keep processor groups "node-aligned" but uneven.
Because my intent in these tests was to probe transaction log writer count and auto softNUMA behavior, that's as far as I need to go with this. I don't expect to ever increase the scheduler count this way, so in my configurations processor groups should always be following the OS processor groups (unless I decide to do some future experiements similar to these.)
*If* I see a VM or physical server with SQL Server 2019 that has fewer transaction log writers than twice the soft\NUMA node count (up to 8), I'll investigate at that time.
That's all for now!
Ciao!!
Q
No comments:
Post a Comment