OK. I like logging a lot of information for my tests.
So I created two-three logging tables for perf stat captures :-)
DROP TABLE IF EXISTS count_reg_2; CREATE TABLE count_reg_2 (test_start DATETIME, session_id INT, login_time DATETIME, scheduler_id INT, cpu_id INT, parent_node_id INT, memory_node_id INT, elapsed_ms INT, iter BIGINT, cpu_ms BIGINT, session_end DATETIME) WITH (data_compression = page); DROP TABLE IF EXISTS count_session_waits_2; CREATE TABLE count_session_waits_2 (test_start DATETIME, session_id INT, wait_type NVARCHAR(256), waiting_tasks_count BIGINT, wait_time_ms BIGINT, max_wait_time_ms BIGINT, signal_wait_time_ms BIGINT) WITH (data_compression = page); DROP TABLE IF EXISTS count_system_waits_2; CREATE TABLE count_system_waits_2 (capture_tm DATETIME, wait_type NVARCHAR(256), waiting_tasks_count BIGINT, wait_time_ms BIGINT, max_wait_time_ms BIGINT, signal_wait_time_ms BIGINT) WITH (data_compression=page);
And I've got a stored procedure for my test workload. This is after a few revisions - there's a couple of important comments in there that maybe someday I'll be able to explain more adequately :-) Why count to 8000000 in the loop by default? It was an early estimate for 10 seconds of solid CPU work on this system, and I wasn't motivated enough to bump it higher once I saw that it fell a little short once NOCOUNT was ON. But I don't want to spill the beans early.
So... here's the stored procedure.
CREATE OR ALTER PROCEDURE timed_count__reg_2 @maxcount INT = 8000000, @test_start DATETIME, @NOCOUNT VARCHAR(10) = 'OFF', @syswaits VARCHAR(10) = 'OFF' AS DECLARE @start_time DATETIME, @end_time DATETIME, @iter INT = 0; WAITFOR DELAY '00:00:01'; -- session waits don't always flush for this test!?! more reliably flush with a WAITFOR IF @NOCOUNT = 'ON' SET NOCOUNT ON; SET @start_time = GETDATE(); IF @syswaits = 'ON' INSERT INTO count_system_waits_2 SELECT @start_time, ws.* FROM sys.dm_os_wait_stats ws WHERE ws.waiting_tasks_count > 0; WHILE @iter < @maxcount SET @iter = @iter + 1; SET @end_time = GETDATE(); IF @syswaits = 'ON' INSERT INTO count_system_waits_2 SELECT @end_time, ws.* FROM sys.dm_os_wait_stats ws WHERE ws.waiting_tasks_count > 0 OPTION (MAXDOP 1); INSERT INTO count_reg_2 SELECT @test_start, @@spid, sess.login_time, req.scheduler_id, sch.cpu_id, sch.parent_node_id, nde.memory_node_id, DATEDIFF(ms, @start_time, @end_time), @iter, sess.cpu, @end_time FROM sys.dm_exec_requests req JOIN sys.dm_os_schedulers sch ON req.scheduler_id = sch.scheduler_id JOIN sys.dm_os_nodes nde ON sch.parent_node_id = nde.node_id JOIN sys.sysprocesses sess ON req.session_id = sess.spid --hey!! important to use sysprocesses here rather than exec_sessions, better reporting of cpu time WHERE req.session_id = @@spid OPTION (MAXDOP 1); INSERT INTO count_session_waits_2 SELECT @test_start, @@spid, sws.wait_type, sws.waiting_tasks_count, sws.wait_time_ms, sws.max_wait_time_ms, sws.signal_wait_time_ms FROM sys.dm_exec_session_wait_stats sws WHERE sws.session_id = @@spid OPTION (MAXDOP 1); --sql_sasquatch
OK. Now Aaron Bertrand has a post from February 2016...
Performance Surprises and Assumptions : SET NOCOUNT ON
https://sqlperformance.com/2016/02/t-sql-queries/nocount
In that blog post the potential performance benefit of NOCOUNT ON was elusive. And, to be honest, this round of my tests was NOT to learn about NOCOUNT but other stuff. Just happened to stumble on NOCOUNT when I was sifting through stuff after some early testing. But with these results in hand, maybe NOCOUNT has a few more performance surprise for all of us :-)
OK, still need a driver for the tests. I'm old as the dinosaurs so I usually throw together a .bat script before Powershell.
I put together a pair of .bat scripts, one for NOCOUNT OFF (default) and one for NOCOUNT ON. These scripts resided on my laptop, and will be executed from my laptop, with a remote connection to SQL Server via sqlcmd. They looked like this.
nocount_off__test.bat
set date1=%date:~4% set date2=%date1:/=-% set datetime='%date2% %time%' START /B sqlcmd -dSASQUATCH -SMountain_home -Q "EXEC timed_count__reg_2 8000000, %datetime%, 'OFF', 'ON';" > nul
nocount_on__test.bat
set date1=%date:~4% set date2=%date1:/=-% set datetime='%date2% %time%' START /B sqlcmd -dSASQUATCH -SMountain_home -Q "EXEC timed_count__reg_2 8000000, %datetime%, 'ON', 'ON';" > nul
The timed_count__reg_2 stored procedure does a fair bit of self-logging, but whenever possible I like to keep eyes-on from outside the individual test sessions as well. Plus, I really like perfmon.
So, on the SQL Server RDBMS VM I have a Sasquatch directory. Its got a .bat script for starting logman and a text file for the counters to log. And a .bat script to stop the counter collection. For this type of test, I typically use a one second collection interval.
Fire up the Sasquatch collector. First executed the script with NOCOUNT OFF. Then executed the script with NOCOUNT ON. Stopped logman. Let's see what I've got.
I'm glad that worked out. The first test was with NOCOUNT OFF. It ran about 50 seconds. The second test was with NOCOUNT ON. It ran really hot on the CPU, in less than 10 seconds. There's something goofy about the CPU vs elapsed numbers for that particular run... that'll have to be for another day.
Notice the difference in the first test between the end ASYNC_NETWORK_IO waiting_tasks_count and the beginning count: 194332 - 186696 = 7636
That's exactly the ASYNC_NETWORK_IO waiting_tasks_count for the first test's session_id 55.
Same with wait_time_ms and signal_wait_time_ms. Nifty :-)
And no ASYNC_NETWORK_IO waits recorded during the 2nd test with NOCOUNT ON. Not at session level, not at system level.
So NOCOUNT made a big difference here. Huh.
Well - can perfmon tell us anything in addition to what we've already seen? Yep. First of all, today was the day I learned that the perfmon wait counters only update every 5 seconds :-). But the other thing perfmon shows us is that the ASYNC_NETWORK_IO waits that occur while NOCOUNT is OFF occur throughout the WHILE LOOP. And... the total wait time isn't determined by the work being done, but by network latency from SQL Server to the client. If the same test is run within the VM, a small amount of ASYNC_NETWORK_IO wait occurrences and total wait time might be observed. But it'll be far less than what I show in these tests. And... this particle test run was really tame. Earlier in the week I ran a similar test from same laptop, same VM and saw the variance from under 10 seconds to over 10 minutes. Gotta go for now though...