Monday, June 24, 2019

SQL Server 2019 CTP 3.0 plus P startup option (for experimentation only)

*****
I'll stress that I use the -P<n> startup option for brief, targeted experimentation and observation only, on my laptop or in test instances.  I've never run a perf/scale workload test against an instance with scheduler count modified this way, and I don't plan to do so.  I'll plead with you instead to never let this startup option get anywhere near a production SQL Server instance, or an instance with workloads of any import. My stance and strategy is currently that its better to limit the "instruction entrance liability" of any one physical core and instead increase instruction intensity by tuning the workload and queries.
***** 
 
The undocumented SQL Server -P<n> startup option (/P<n> if used in NET START command) instructs SQL Server to create <n> schedulers.  Until I tried it and started poking around, I thought the SQLOS schedulers would just be stacked on top of otherwise-existing SQLOS cpu, node, memory node, and processor group structures.  I expected that on my laptop I'd be restricted to a maximum of 64 schedulers since at the Windows OS level my laptop has a single processor group.  My main goals with these tests were to probe autosoftNUMA behavior at different scheduler counts-per-node and transaction log writer assignment at different node counts.

I learned a few things along the way. 😂😂😂

First, I created two tables in the [master] database.


USE MASTER;
GO

CREATE TABLE system_details
(sqlserver_start_time        DATETIME,
 scheduler_count             INT,
 cpu_count                   INT,
 socket_count                INT,
 numa_node_count             INT,
 softnuma_configuration_desc NVARCHAR(50),
 lgwr_count                  INT);
GO

CREATE TABLE node_details
(sqlserver_start_time   DATETIME,
 node_id                INT,
 node_state_desc        NVARCHAR(50),
 memory_node_id         INT,
 processor_group        INT,
 online_scheduler_count INT);
GO

Next, a stored procedure in the [master] database.

USE MASTER;
GO

CREATE OR ALTER PROCEDURE startup_sys_details
AS

;WITH txwr AS (SELECT lgwr_count = COUNT(*)  
               FROM sys.dm_exec_requests req  
               WHERE req.command = 'LOG WRITER')
INSERT INTO system_details
SELECT osi.sqlserver_start_time, osi.scheduler_count, osi.cpu_count,
osi.socket_count, osi.numa_node_count, osi.softnuma_configuration_desc,
txwr.lgwr_count 
FROM sys.dm_os_sys_info osi
CROSS JOIN txwr;

INSERT INTO node_details
SELECT osi.sqlserver_start_time, osn.node_id, osn.node_state_desc, 
osn.memory_node_id, osn.processor_group, osn.online_scheduler_count
FROM sys.dm_os_nodes osn
CROSS JOIN sys.dm_os_sys_info osi
WHERE osn.node_id < 64;
GO

Then, let's assign that stored procedure as an auto-run startup stored procedure.

USE MASTER;
GO
EXEC sp_procoption 'startup_sys_details', 'STARTUP', 'ON';
GO

Add a .bat script with NET STOP and NET START commands for the SQL Server service, plus a sqlcmd command to turn the auto-start stored procedure back off after the desired tests, and all the pieces are available.


NET STOP MSSQL$CTP_3_0__2019
NET START MSSQL$CTP_3_0__2019
NET STOP MSSQL$CTP_3_0__2019
NET START MSSQL$CTP_3_0__2019 /P8
NET STOP MSSQL$CTP_3_0__2019
NET START MSSQL$CTP_3_0__2019 /P16
NET STOP MSSQL$CTP_3_0__2019
NET START MSSQL$CTP_3_0__2019 /P24
NET STOP MSSQL$CTP_3_0__2019
NET START MSSQL$CTP_3_0__2019 /P28
NET STOP MSSQL$CTP_3_0__2019
NET START MSSQL$CTP_3_0__2019 /P32
NET STOP MSSQL$CTP_3_0__2019
NET START MSSQL$CTP_3_0__2019 /P36
NET STOP MSSQL$CTP_3_0__2019
NET START MSSQL$CTP_3_0__2019 /P40
NET STOP MSSQL$CTP_3_0__2019
NET START MSSQL$CTP_3_0__2019 /P44
NET STOP MSSQL$CTP_3_0__2019
NET START MSSQL$CTP_3_0__2019 /P48
NET STOP MSSQL$CTP_3_0__2019
NET START MSSQL$CTP_3_0__2019 /P56
NET STOP MSSQL$CTP_3_0__2019
NET START MSSQL$CTP_3_0__2019 /P62
NET STOP MSSQL$CTP_3_0__2019
NET START MSSQL$CTP_3_0__2019 /P64
NET STOP MSSQL$CTP_3_0__2019
NET START MSSQL$CTP_3_0__2019 /P72
NET STOP MSSQL$CTP_3_0__2019
NET START MSSQL$CTP_3_0__2019 /P80
sqlcmd -S localhost\CTP_3_0__2019 -d master -Q "EXEC sp_procoption 'startup_sys_details', 'STARTUP', 'OFF';"
NET STOP MSSQL$CTP_3_0__2019
NET START MSSQL$CTP_3_0__2019


So, execute my batch script.  Then let's see the results.


SELECT start_number = ROW_NUMBER() OVER (ORDER BY sd.sqlserver_start_time), sd.* 
FROM system_details sd
ORDER BY sd.sqlserver_start_time;


I've highlighted some minor surprises above in red.  First, at 16 schedulers I expected auto-softNUMA to be active, but it wasn't.  The other two red boxes highlight scenarios with fewer than the 8 transaction log writers I expected.  I don't plan to investigate these percularities at this time.


;WITH grp AS
(SELECT start_number = ROW_NUMBER() OVER (ORDER BY nd.sqlserver_start_time), nd.sqlserver_start_time
FROM node_details nd
GROUP BY nd.sqlserver_start_time)
SELECT grp.start_number, grp.sqlserver_start_time, nd.processor_group, nd.node_state_desc,
       node_count = COUNT(DISTINCT nd.node_id), nd.online_scheduler_count
FROM node_details nd
JOIN grp ON nd.sqlserver_start_time = grp.sqlserver_start_time
GROUP BY grp.start_number, grp.sqlserver_start_time, 
         nd.processor_group, nd.node_state_desc, nd.online_scheduler_count
ORDER BY grp.start_number, grp.sqlserver_start_time,
         nd.processor_group, nd.node_state_desc;



Not as much surprise in the node details in this testing.  It may be surprising that in order to accomodate the request for more than 64 schedulers, SQLOS created another processor group structure to hold the schedulers even though the OS still has a single processor group.  Perhaps its also surprising that while the Windows OS now tries to balance logical CPU count in processor groups, SQLOS is content to keep processor groups "node-aligned" but uneven.

Because my intent in these tests was to probe transaction log writer count and auto softNUMA behavior, that's as far as I need to go with this.  I don't expect to ever increase the scheduler count this way, so in my configurations processor groups should always be following the OS processor groups (unless I decide to do some future experiements similar to these.)

*If* I see a VM or physical server with SQL Server 2019 that has fewer transaction log writers than twice the soft\NUMA node count (up to 8), I'll investigate at that time.

That's all for now!

Ciao!!

Q

No comments:

Post a Comment