Friday, April 17, 2020

SQL Server 2017 cu17 Columnstore Workload OOMs Part 2: Whoever Smelt It, Dealt It?

This blog post is under construction...

This blog post is part of a series started yesterday with the post linked immediately below.

SQL Server 2017 cu17 ColumnStore Workload OOMs Part 1
https://sql-sasquatch.blogspot.com/2020/04/sql-server-2017-cu17-columnstore.html



Error investigation can be informed by one of two paradigms:
I think of the first paradigm as "whoever smelt it, dealt it."
The second paradigm is "the bearer of bad news."
Sometimes to reach the correct conclusion, the events and surrounding timeline must be examined from both viewpoints.

Whoever Smelt It, Dealt It

This article provided some needed laughter yesterday when I read it.  It's not directly relevant to the matter at hand... but it's worth a chuckle.


We Asked Scientists Whether He Who Smelt It Really Dealt It

https://www.vice.com/en_us/article/ypa5x5/we-asked-scientists-whether-he-who-smelt-it-really-dealt-it

What the heck does this sophomoric idea have to do with OOMs, or any error type?

Consider this simplistic example.  The error is a direct result of the action by this session.  This session's action alone was sufficient to warrant the "divide by zero" error message.
This session smelt the error message, because this session dealt the divide by zero.
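
A minimal sketch of that kind of self-inflicted error - any single session can reproduce it on its own:

-- One session both causes and receives the error; no other session's activity is involved.
SELECT 1/0 AS oops;
-- Msg 8134, Level 16, State 1
-- Divide by zero error encountered.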


As it relates to OOMs, consider the OOM described in this blog post.
SQL Server 2019 Scalar UDF inlining - OOM in some cases
https://sql-sasquatch.blogspot.com/2019/11/sql-server-2019-udf-inlining-oom-in.html

A single session in isolation executing a query with an ugly UDF that triggers UDF inlining is sufficient to consume enough [Optimizer Memory] to result in the OOMs described in the post.  Yes, it's a (fixed in CU2) bug, but the activity by that individual session realizes the risk presented by the bug.  That session smelt it by way of the OOM, after that session dealt it by using an ugly inlined UDF.

OK.  Now let's focus on the OOMs I'm currently dealing with.  In SQL Server, if an OOM can be properly described as "whoever smelt it, dealt it," the memory activity must be attributable to the session that received the error and only that session.

One way for that to be the case is for that session to be running in isolation - no other user sessions on the instance.  That's not the case in my observations of the OOMs I'm concerned with.  Each occurrence of these OOMs happened while there were multiple concurrent user sessions.

Another way for the memory activity to be attributable to the specific session that received the error is if the memory type is specifically and solely allocated to the session.  Connection memory works like that.  Connection memory is within [Total Server Memory] but individual sessions have private access to small chunks of it.  Optimizer memory works that way, too.  So, too, does the used portion of query memory grants.  All of the memory in [Total Server Memory] that can be individually attributed to sessions is within [Stolen Server Memory].  But not all [Stolen Server Memory] can be individually attributed to sessions.  (For example, consider the plan cache within [Stolen Server Memory].  Although an individual session inserts a plan into cache, while the plan is cached other sessions can use it.  And a cached plan can stay in cache after the session that originally inserted it has ended.)
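
If you want to eyeball those individually attributable memory types, here's a rough sketch against the perfmon counters exposed in sys.dm_os_performance_counters.  The counter names are the documented [Memory Manager] ones mentioned above; the object_name filter assumes a default instance (adjust for a named instance).

SELECT counter_name, cntr_value AS kb
FROM sys.dm_os_performance_counters
WHERE object_name LIKE N'%Memory Manager%'
AND counter_name IN (N'Connection Memory (KB)',
                     N'Optimizer Memory (KB)',
                     N'Granted Workspace Memory (KB)',
                     N'Reserved Server Memory (KB)',
                     N'Stolen Server Memory (KB)');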

It just so happens I have some graphs. Each of the red vertical lines below is an OOM.  Usually in a graph like this I have [Stolen Server Memory] at the top of the stacked graph, with [Free Memory] and [Database Cache Memory] below it.  Like this...



But since I want to focus on [Stolen Server Memory] I want it at the bottom of the stack for now, like this...



In the workloads I am concerned with, the largest portion of individually attributable [Stolen Server Memory] is the amount of granted query memory that is used for sort/hash/columnstore compression at any given time.  If all user workload groups are associated with the Default Resource Pool, that amount is [\SQLServer:Memory Manager\Granted Workspace Memory (KB)] - [\SQLServer:Memory Manager\Reserved Server Memory (KB)].  If user Resource Pools other than Default were active during the time period of concern, the information should be gathered from sys.dm_exec_query_memory_grants and/or sys.dm_exec_query_resource_semaphores to account for the total granted and total reserved memory.
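
When pools other than Default are in play, a sketch like this against the grants DMV gives granted vs used memory per pool.  The view and column names are the documented ones; the per-pool rollup is just my own convenience.

SELECT rp.name AS pool_name,
       SUM(mg.granted_memory_kb) AS granted_kb,
       SUM(mg.used_memory_kb)    AS used_kb
FROM sys.dm_exec_query_memory_grants mg
JOIN sys.dm_resource_governor_resource_pools rp ON mg.pool_id = rp.pool_id
GROUP BY rp.name;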

Fortunately for me, on this day the only resource pools present were Default and Internal.  A little bit easier path.

The next graph is the amount of granted memory (not necessarily already used), with [\SQLServer:Memory Manager\Total Server Memory (KB)] included on the same graph for scale.

First of all, I'll point out that the total granted memory is not very high compared to the target server memory. Also, the yellow box indicates high points for granted memory that occurred without errors, while errors occurred later with lower levels of granted memory.


Let's zoom in a little for better visibility.  This graph is granted memory - but it doesn't indicate how much of the granted memory is used.



Because only the Default resource pool was in play, layering the reserved memory in front of granted memory gives an idea of the used query memory.  In the graph below, the exposed dark green is the portion of [\SQLServer:Memory Manager\Granted Workspace Memory (KB)] which is used.



well, well...



well, well....






Date,Source,Severity,Message
12/21/2019 03:15:35,spid83,Unknown,Memory Broker Clerk (Column store object pool)      Pages<nl/>---------------------------------------- ----------<nl/>Total                                      34670319<nl/>Simulated                                    166692<nl/>Simulation Benefit                                0<nl/>Internal Benefit                                  0<nl/>External Benefit                                  0<nl/>Value Of Memory                                   0<nl/>Periodic Freed                                    0<nl/>Internal Freed                                    0
12/21/2019 03:17:51,spid79,Unknown,Memory Broker Clerk (Column store object pool)      Pages<nl/>---------------------------------------- ----------<nl/>Total                                      34670319<nl/>Simulated                                    166692<nl/>Simulation Benefit                                0<nl/>Internal Benefit                                  0<nl/>External Benefit                                  0<nl/>Value Of Memory                                   0<nl/>Periodic Freed                                    0<nl/>Internal Freed                                    0
12/21/2019 03:24:14,spid51,Unknown,Memory Broker Clerk (Column store object pool)      Pages<nl/>---------------------------------------- ----------<nl/>Total                                      34670319<nl/>Simulated                                    166692<nl/>Simulation Benefit                                0<nl/>Internal Benefit                                  0<nl/>External Benefit                                  0<nl/>Value Of Memory                                   0<nl/>Periodic Freed                                    0<nl/>Internal Freed                                    0
12/21/2019 03:36:41,spid73,Unknown,Memory Broker Clerk (Column store object pool)      Pages<nl/>---------------------------------------- ----------<nl/>Total                                      34670319<nl/>Simulated                                    166692<nl/>Simulation Benefit                                0<nl/>Internal Benefit                                  0<nl/>External Benefit                                  0<nl/>Value Of Memory                                   0<nl/>Periodic Freed                                    0<nl/>Internal Freed                                    0
12/21/2019 03:43:01,spid89,Unknown,Memory Broker Clerk (Column store object pool)      Pages<nl/>---------------------------------------- ----------<nl/>Total                                      34670319<nl/>Simulated                                    166692<nl/>Simulation Benefit                                0<nl/>Internal Benefit                                  0<nl/>External Benefit                                  0<nl/>Value Of Memory                                   0<nl/>Periodic Freed                                    0<nl/>Internal Freed                                    0
12/21/2019 03:47:00,spid55,Unknown,Memory Broker Clerk (Column store object pool)      Pages<nl/>---------------------------------------- ----------<nl/>Total                                      34670319<nl/>Simulated                                    166692<nl/>Simulation Benefit                                0<nl/>Internal Benefit                                  0<nl/>External Benefit                                  0<nl/>Value Of Memory                                   0<nl/>Periodic Freed                                    0<nl/>Internal Freed                                    0
12/21/2019 03:50:43,spid57,Unknown,Memory Broker Clerk (Column store object pool)      Pages<nl/>---------------------------------------- ----------<nl/>Total                                      34670740<nl/>Simulated                                    166692<nl/>Simulation Benefit                                0<nl/>Internal Benefit                                  0<nl/>External Benefit                                  0<nl/>Value Of Memory                                   0<nl/>Periodic Freed                                    0<nl/>Internal Freed                                    0
12/21/2019 03:54:08,spid73,Unknown,Memory Broker Clerk (Column store object pool)      Pages<nl/>---------------------------------------- ----------<nl/>Total                                      34639426<nl/>Simulated                                    250762<nl/>Simulation Benefit                                0<nl/>Internal Benefit                                  0<nl/>External Benefit                                  0<nl/>Value Of Memory                                   0<nl/>Periodic Freed                                    0<nl/>Internal Freed                                84070
12/21/2019 03:56:07,spid73,Unknown,Memory Broker Clerk (Column store object pool)      Pages<nl/>---------------------------------------- ----------<nl/>Total                                      34639426<nl/>Simulated                                    250762<nl/>Simulation Benefit                                0<nl/>Internal Benefit                                  0<nl/>External Benefit                                  0<nl/>Value Of Memory                                   0<nl/>Periodic Freed                                    0<nl/>Internal Freed                                84070
12/21/2019 03:58:37,spid73,Unknown,Memory Broker Clerk (Column store object pool)      Pages<nl/>---------------------------------------- ----------<nl/>Total                                      34639426<nl/>Simulated                                    250762<nl/>Simulation Benefit                                0<nl/>Internal Benefit                                  0<nl/>External Benefit                                  0<nl/>Value Of Memory                                   0<nl/>Periodic Freed                                    0<nl/>Internal Freed                                84070
12/21/2019 04:00:43,spid52,Unknown,Memory Broker Clerk (Column store object pool)      Pages<nl/>---------------------------------------- ----------<nl/>Total                                      34639426<nl/>Simulated                                    250762<nl/>Simulation Benefit                                0<nl/>Internal Benefit                                  0<nl/>External Benefit                                  0<nl/>Value Of Memory                                   0<nl/>Periodic Freed                                    0<nl/>Internal Freed                                84070
12/21/2019 04:03:27,spid70,Unknown,Memory Broker Clerk (Column store object pool)      Pages<nl/>---------------------------------------- ----------<nl/>Total                                      34639437<nl/>Simulated                                    250718<nl/>Simulation Benefit                                0<nl/>Internal Benefit                                  0<nl/>External Benefit                                  0<nl/>Value Of Memory                                   0<nl/>Periodic Freed                                    0<nl/>Internal Freed                                84070
12/21/2019 04:05:45,spid74,Unknown,Memory Broker Clerk (Column store object pool)      Pages<nl/>---------------------------------------- ----------<nl/>Total                                      34639437<nl/>Simulated                                    250718<nl/>Simulation Benefit                                0<nl/>Internal Benefit                                  0<nl/>External Benefit                                  0<nl/>Value Of Memory                                   0<nl/>Periodic Freed                                    0<nl/>Internal Freed                                84070
12/21/2019 04:09:38,spid70,Unknown,Memory Broker Clerk (Column store object pool)      Pages<nl/>---------------------------------------- ----------<nl/>Total                                      34639440<nl/>Simulated                                    250706<nl/>Simulation Benefit                                0<nl/>Internal Benefit                                  0<nl/>External Benefit                                  0<nl/>Value Of Memory                                   0<nl/>Periodic Freed                                    0<nl/>Internal Freed                                84070
12/21/2019 04:15:57,spid107,Unknown,Memory Broker Clerk (Column store object pool)      Pages<nl/>---------------------------------------- ----------<nl/>Total                                      21670985<nl/>Simulated                                   2875695<nl/>Simulation Benefit                                0<nl/>Internal Benefit                                  0<nl/>External Benefit                                  0<nl/>Value Of Memory                                   0<nl/>Periodic Freed                                    0<nl/>Internal Freed                                84070
12/21/2019 04:23:20,spid94,Unknown,Memory Broker Clerk (Column store object pool)      Pages<nl/>---------------------------------------- ----------<nl/>Total                                      21660293<nl/>Simulated                                   2892211<nl/>Simulation Benefit                       0.0000000000<nl/>Internal Benefit                                  0<nl/>External Benefit                                  0<nl/>Value Of Memory                                   0<nl/>Periodic Freed                                    0<nl/>Internal Freed                                84070

well, well...


Memory Broker Clerk (Column store object pool)
Time               Pages            kb
03:15:35           34670319         277362552
03:17:51           34670319         277362552
03:24:14           34670319         277362552
03:36:41           34670319         277362552
03:43:01           34670319         277362552
03:47:00           34670319         277362552
03:50:43           34670740         277365920
03:54:08           34639426         277115408
03:56:07           34639426         277115408
03:58:37           34639426         277115408
04:00:43           34639426         277115408
04:03:27           34639437         277115496
04:05:45           34639437         277115496
04:09:38           34639440         277115520
04:15:57           21670985         173367880
04:23:20           21660293         173282344

a haw haw haw...

Thursday, April 16, 2020

SQL Server 2017 cu17 ColumnStore Workload OOMs Part 1

A support ticket concerning this behavior is open with Microsoft.  SQL Server 2017 CU20 has not been evaluated for this behavior.  Although no published fix included in CU20 appears to address this behavior, it's possible additional effects of the documented fixes address the behavior.  It's also possible an undocumented CU20 fix addresses this behavior.  However, as CU20 has never been brought up in the context of the support ticket, and - as can be seen at the end of this blog post - Microsoft has previously documented fixes to similar behavior, I currently believe this behavior is likely to exist in SQL Server 2017 CU20 as well as CU17, CU18, and CU19.

Let's start the story with a SQL Server 2017 CU17 service restart on December 20, 2019.

All of the graphs below unless otherwise noted are from perfmon, captured in 30 second increments.

It's a VMware VM, with Windows Server 2016 as the operating system. Global startup trace flags 4199 (optimizer hotfixes), 4139 (quickstats histogram amendments), 3226 (no errorlog messages for successful t-log backups). Nothing too surprising there.

This vm has 56 vcpu and 620 GB vRAM.

[Max Server Memory] is set to 590000 mb.


12/20/2019 11:03:24,spid8s,Unknown,SQL Server shutdown has been initiated      
12/20/2019 11:03:24,spid8s,Unknown,SQL Trace was stopped due to server shutdown. Trace ID = '1'. This is an informational message only; no user action is required.      
12/20/2019 11:03:25,Server,Unknown,Microsoft SQL Server 2017 (RTM-CU17) (KB4515579) - 14.0.3238.1 (X64) <nl/> Sep 13 2019 15:49:57 <nl/> Copyright (C) 2017 Microsoft Corporation<nl/> Enterprise Edition: Core-based Licensing (64-bit) on Windows Server 2016 Standard 10.0 <X64> (Build 14393: ) (Hypervisor)   
12/20/2019 11:03:25,Server,Unknown,UTC adjustment: -6:00      
12/20/2019 11:03:25,Server,Unknown,(c) Microsoft Corporation.      
12/20/2019 11:03:25,Server,Unknown,All rights reserved.      
12/20/2019 11:03:25,Server,Unknown,Server process ID is 12764.      
12/20/2019 11:03:25,Server,Unknown,System Manufacturer: 'VMware<c/> Inc.'<c/> System Model: 'VMware Virtual Platform'.      
12/20/2019 11:03:25,Server,Unknown,Authentication mode is MIXED.      
12/20/2019 11:03:25,Server,Unknown,Logging SQL Server messages in file 'C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\MSSQL\Log\ERRORLOG'.      
12/20/2019 11:03:25,Server,Unknown,The service account is '***redacted***'. This is an informational message; no user action is required.      
12/20/2019 11:03:25,Server,Unknown,Registry startup parameters: <nl/>  -d C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\MSSQL\DATA\master.mdf<nl/>  -e C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\MSSQL\Log\ERRORLOG<nl/>  -l C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\MSSQL\DATA\mastlog.ldf<nl/>  -T 4199<nl/>  -T 4139<nl/>  -T 3226
12/20/2019 11:03:25,Server,Unknown,Command Line Startup Parameters:<nl/>  -s "MSSQLSERVER"
12/20/2019 11:03:25,Server,Unknown,SQL Server detected 28 sockets with 2 cores per socket and 2 logical processors per socket<c/> 56 total logical processors; using 56 logical processors based on SQL Server licensing. This is an informational message; no user action is required.

When [Max Server Memory] can be attained by SQL Server, [Target Server Memory] will be equal to [Max Server Memory].  If memory conditions external to SQL Server make [Max Server Memory] unattainable, [Target Server Memory] will be adjusted downward to an attainable value.  As workload is placed on the system, [Total Server Memory] grows toward [Target Server Memory].  That's typical, expected behavior.

In this case, the story stays boring until after the following midnight.  There wasn't enough workload to drive much growth of [Total Server Memory] until about 2:15 am, after which [Total Server Memory] grew fairly rapidly.  [Total Server Memory] reached a plateau between 3:00 am and 3:15 am, and then stepped slightly down later.  [Target Server Memory] was never attained.  That's curious.


[Total Server Memory] is the sum of [Database Cache Memory], [Free Memory], and [Stolen Memory] for the instance.  Here's a graph of those three memory categories over the same time period.
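
Those three categories, plus [Total] and [Target], are also visible from inside the instance.  A quick sketch, again assuming a default instance name:

SELECT counter_name, cntr_value AS kb
FROM sys.dm_os_performance_counters
WHERE object_name LIKE N'%Memory Manager%'
AND counter_name IN (N'Database Cache Memory (KB)',
                     N'Free Memory (KB)',
                     N'Stolen Server Memory (KB)',
                     N'Total Server Memory (KB)',
                     N'Target Server Memory (KB)');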


It really is quite curious that [Total Server Memory] never reached [Target Server Memory] and in fact eventually backed *farther off* from [Target Server Memory].  If external factors made [Target Server Memory] unattainable, I expect [Target Server Memory] to be adjusted downward.  But that didn't happen.

I'm a very curious guy, but I don't spend days' worth of time creating and staring at graphs of the same short time period simply out of curiosity.  This morning's time period encompassed 16 SQL Server out-of-memory errors.  I want to be able to diagnose them, and find out when the contributing behavior was introduced, in order to advise organizations that are planning SQL Server upgrades - or that fall prey to the type of OOMs experienced in this instance.

I'm going to tighten the timeframe for the remaining graphs to 2:30 am to 5:00 am, in order for the individual errors to be discernible on the graphs.

There are the 16 OOMs, each marked with a vertical red line in the graph below.  Interestingly, they occur during the plateau of [Total Server Memory].  But... SQL Server is still a long way off from reaching [Target Server Memory]... why not just grow SQL Server's memory share?

As I mentioned before, if external memory conditions made [Target Server Memory] unattainable, I expect [Target Server Memory] to be lowered.  That didn't happen, hinting that sufficient memory was still present for [Total Server Memory] to grow.

What does perfmon have to say about [Available Memory]?  Does it corroborate my conclusion that SQL Server *could have* continued to grow [Total Server Memory]?  Sure 'nuf.  At the time of *all 16* of the OOMs, there was over 100GB of [Available Memory], nicely divided across the 2 vNUMA nodes.

Those familiar with my memory graphs on this blog and in my twitter activity as @sqL_handLe may note that I'm usually much more preoccupied with [Free & Zero Page List Memory] than with [Available Memory].  Depending on the memory consumers and activity on a VM, there may be considerable memory accounted as [Available Memory] which is not [Free & Zero Page List Memory].  If a memory allocation for a consumer must be zeroed before being handed over, it's gotta come from [Free & Zero Page List Memory] (which I often affectionately call FZPL).

Thankfully in this case, FZPL is nearly all of [Available Memory] and is similarly well-divided between the vNUMA nodes.



So [Available Memory] and [Free & Zero Page List Memory] are plentiful and balanced between the two vNUMA nodes.  Can we see how much of SQL Server [Total Server Memory] is on each of the SQLOS nodes?  Yup.  Meet my friend [Total Node Memory].  Equally balanced across the two SQLOS nodes.



In fact, the categories of [Database], [Stolen], and [Free] SQLOS memory can be tracked at SQLOS node level, too.  Pretty nice balance on nodes 000 and 001.
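
For the curious, here's a sketch of the same per-node breakdown pulled from the [Memory Node] counters; instance_name carries the SQLOS memory node id (000, 001, ...).

SELECT instance_name AS node_id, counter_name, cntr_value AS kb
FROM sys.dm_os_performance_counters
WHERE object_name LIKE N'%Memory Node%'
AND counter_name IN (N'Database Node Memory (KB)',
                     N'Stolen Node Memory (KB)',
                     N'Free Node Memory (KB)',
                     N'Total Node Memory (KB)')
ORDER BY instance_name, counter_name;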



Extracted from the SQL Server errorlog, here are the 16 OOMs from that night.


12/21/2019 03:15:27,spid83,Unknown,Failed to allocate BUFs: FAIL_BUFFER_ALLOCATION 1
12/21/2019 03:17:43,spid79,Unknown,Failed to allocate BUFs: FAIL_BUFFER_ALLOCATION 8
12/21/2019 03:24:02,spid51,Unknown,Failed to allocate BUFs: FAIL_BUFFER_ALLOCATION 8
12/21/2019 03:36:37,spid73,Unknown,Failed to allocate BUFs: FAIL_BUFFER_ALLOCATION 1
12/21/2019 03:42:54,spid89,Unknown,Failed to allocate BUFs: FAIL_BUFFER_ALLOCATION 1
12/21/2019 03:46:54,spid55,Unknown,Failed to allocate BUFs: FAIL_BUFFER_ALLOCATION 1
12/21/2019 03:50:36,spid57,Unknown,Failed to allocate BUFs: FAIL_BUFFER_ALLOCATION 1
12/21/2019 03:53:56,spid73,Unknown,Failed to allocate BUFs: FAIL_BUFFER_ALLOCATION 1
12/21/2019 03:55:57,spid73,Unknown,Failed to allocate BUFs: FAIL_BUFFER_ALLOCATION 1
12/21/2019 03:58:28,spid73,Unknown,Failed to allocate BUFs: FAIL_BUFFER_ALLOCATION 1
12/21/2019 04:00:39,spid52,Unknown,Failed to allocate BUFs: FAIL_BUFFER_ALLOCATION 1
12/21/2019 04:03:14,spid70,Unknown,Failed to allocate BUFs: FAIL_BUFFER_ALLOCATION 16
12/21/2019 04:05:34,spid74,Unknown,Failed to allocate BUFs: FAIL_BUFFER_ALLOCATION 1
12/21/2019 04:09:26,spid70,Unknown,Failed to allocate BUFs: FAIL_BUFFER_ALLOCATION 24
12/21/2019 04:15:44,spid107,Unknown,Failed to allocate BUFs: FAIL_BUFFER_ALLOCATION 8
12/21/2019 04:23:14,spid94,Unknown,Failed to allocate BUFs: FAIL_BUFFER_ALLOCATION 1

So here's the summary of the story: 16 OOMs occurred in one night.  The instance hadn't reached [Target Server Memory], and there doesn't seem to be a good reason for [Total Server Memory] to *not* have grown rather than incur an OOM.  There was over 50GB of FZPL per vNUMA node, for goodness sake!  Memory use seems to have been quite evenly divided between the vNUMA nodes.

For the last two OOMs, at 4:15 am and 4:23 am, there was even a large amount of [Free] SQLOS memory within [Total Server Memory].  And there was plenty of [Free] SQLOS memory on either of the two SQLOS nodes!!  The possibility of SQLOS [Total] memory growth at the time of any of the 16 OOMs, and the additional presence of significant SQLOS [Free] memory on either SQLOS node during the last 2 OOMs fully convinces me that this is a bug.  (Well, that and not having seen this type of behavior in any of the SQL Server 2016 or 2019 versions I've worked with.)

When the dust clears from this, I think a bug similar to this one from SQL Server 2017 cu10 will be found.  I am upset at myself that in 5 months of staring down *this* behavior, I haven't been able to marshal the resources to get *this* bug properly identified so it can be corrected.

FIX: Out of memory error occurs even when there are many free pages in SQL Server
https://support.microsoft.com/en-us/help/4347088/fix-out-of-memory-error-when-there-are-free-pages-in-sql-server

The fix referenced above, kb4347088, is found in SQL Server 2017 cu10 and SQL Server 2016 sp2 cu3.  The problem described in this blog post shares some similarities, and has been observed on SQL Server 2017 cu17, cu18, and cu19.  As of 2020 April 16 this bug has not been identified and there is no fix imagined or planned.





Sunday, December 15, 2019

Interpreting SQL Server 2019 DBCC MEMORYSTATUS output -- Part 1

Recently someone asked me for help interpreting DBCC MEMORYSTATUS output.  And it's taken me too doggone long to start answering their questions.  Sorry - I'll start with this and keep going.

In response to certain OOM conditions, SQL Server will automatically log DBCC MEMORYSTATUS output to the error log.  Here I'll look at output from the command run via a task scheduler script on a 5 minute schedule.

One issue with the excerpt of the output below is the lack of units.

start Fri 12/13/2019 14:40:00.80 
Process/System Counts                                Value               
---------------------------------------------------- --------------------
Available Physical Memory                                    250846224384
Available Virtual Memory                                  136302950027264
Available Paging File                                        298663870464
Working Set                                                   13835464704
Percent of Committed Memory in WS                                      99
Page Faults                                                       6876485
System physical memory high                                             1
System physical memory low                                              0
Process physical memory low                                             0
Process virtual memory low                                              0
~~~~~
end   Fri 12/13/2019 14:40:06.62

Here's some info from perfmon.


Time 14:40:01 14:40:06 14:40:11
\NUMA Node Memory(_Total)\Total MBytes 952319 952319 952319
\NUMA Node Memory(_Total)\Free & Zero Page List MBytes 233635 233634 233634
\NUMA Node Memory(_Total)\Available MBytes 239222 239221 239229
\Paging File(_Total)\% Usage 0 0 0



MemoryStatus gives Available Physical Memory = 250846224384.  Assuming that number is specified in bytes, dividing by 1048576 yields megabytes: 239225.6 mb.  That's no more than 4.6 mb away from the 3 values for [\NUMA Node Memory(_Total)\Available MBytes] shown in the perfmon above.
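
A quick sanity check of that bytes-to-megabytes assumption:

SELECT 250846224384 / 1048576.0 AS available_physical_mb;  -- 239225.6 mb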

Below "wmic pagefileset list" shows the lone pagefile with a size of 48000 mb. Perfmon above shows it at 0% used.



total_physical_memory_kb     =  975175156 kb = 930 gb
available_physical_memory_kb =  243809076 kb = 238095 mb
total_page_file_kb           = 1024327156 kb = 930 gb + 48000 mb
available_page_file_kb       =  289622644 kb = 234835 mb + 48000 mb




So what have we got?  MEMORYSTATUS [Available Physical Memory] measured in bytes appears to be the same resource as perfmon [\NUMA Node Memory(_Total)\Available MBytes] and sys.dm_os_sys_memory.available_physical_memory_kb.

MEMORYSTATUS [Available Paging File] measured in bytes appears to be free paging file plus MEMORYSTATUS [Available Physical Memory]. MEMORYSTATUS [Available Paging File] appears to be the same resource as sys.dm_os_sys_memory.available_page_file_kb.
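
For reference, a sketch of the DMV columns those MEMORYSTATUS values seem to line up with:

SELECT total_physical_memory_kb,
       available_physical_memory_kb,
       total_page_file_kb,
       available_page_file_kb
FROM sys.dm_os_sys_memory;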

What about MEMORYSTATUS [Available Virtual Memory]?  It seems so much higher than other measures.  Pulling in the MEMORYSTATUS value from above...


Available Virtual Memory = 136302950027264 B = 123.97 TB


Here's a similar number...






 total_virtual_address_space_kb = 137438953343 kb = 128 TB

The value from dm_os_process_memory was retrieved several days after the other values in this blog post.  *And* after a SQL Server service restart.
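
That value came from a query along these lines:

SELECT total_virtual_address_space_kb,
       virtual_address_space_committed_kb,
       virtual_address_space_available_kb
FROM sys.dm_os_process_memory;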

This KB article explains that SQL Server process virtual address space starts at approximately 8TB in Windows Server 2012.  But it also mentions "Windows 2012 R2 [and above] allows virtual address space to grow as large as 128 TB".

kb3074434
FIX: Out of memory error when the virtual address space of the SQL Server process is very low on available memory
https://support.microsoft.com/en-us/help/3074434/


That's all for now, folks.



  







Tuesday, December 10, 2019

Renouncing the Symbol: 1940 Resolution by Tribes to Reject the Swastika or Fylfot

Each of the 10 articles or captioned photographs linked below refers to the same event - the signing of a resolution by members of Navajo, Papago, Apache, and Hopi tribes in early 1940.

I share links to these articles and photographs to illustrate that, over time, content, art, and craft creators have been very serious about the current as well as historic associations of the symbols they use.

The text of the resolution as displayed by the photographs and articles below:
"Because the above ornament which has been a sign of friendship among our forefathers for many centuries has been desecrated recently by another nation of peoples,
Therefore it is resolved that henceforth from this date on and forever more our tribes renounce the use of the emblem commonly known today as the swastika or fylfot on our blankets, baskets, art objects, sandpaintings and clothing."

An additional note about two of the stories below: numbers 6 and 10 in the list seem to deny the agency of these tribes - and of the individuals themselves - to act in accordance with their principles.  I'm especially troubled by the tone of the article in number 6, and hope to write a follow-up post on that particular article in the future.

1.
Indians Denounce Nazis, Forego Use of Swastika
St. Joseph Gazette, St. Joseph, Missouri
Monday, 26 February 1940
https://www.newspapers.com/clip/36417593/indians_denounce_swastika/

2.
Indian Sign on Swastika
The Akron Beacon Journal, Akron, Ohio
Monday, 4 March 1940
https://www.newspapers.com/clip/36417663/indians_denounce_swastika_2/

3.
The Evening Review, East Liverpool, Ohio
Thursday, 29 February 1940
https://www.newspapers.com/clip/36417770/indians_denounce_swastika_3/

4.
Star-Gazette, Elmira, New York
Thursday, 29 February 1940 
https://www.newspapers.com/clip/36417960/indians_denounce_swastika_4/

5.
Indians Bar Swastika Design as Protest Against Nazis
The Los Angeles Times, Los Angeles, California
Monday, 26 February 1940
https://www.newspapers.com/clip/36418025/indians_denounce_swastika_5/

6.
Arizona Indians Bow to Hitler
The Twin Falls News, Twin Falls, Idaho
8 March 1940 
https://www.newspapers.com/clip/36418064/indians_denounce_swastika_6/

7.
Four Indian Tribes Unite in Order Barring Swastika as Their Symbol
The San Bernardino County Sun, San Bernardino, California
Tuesday, 27 February 1940
https://www.newspapers.com/clip/36418096/indians_denounce_swastika_7/

8.
Indians Denounce Swastika Emblem
Wednesday, 28 February 1940
https://www.newspapers.com/clip/36418167/indians_denounce_swastika_8/

9.
Indians Ban Swastika
The Town Talk, Alexandria, Louisiana
Monday, 1 April 1940
https://www.newspapers.com/clip/36418274/indians_denounce_swastika_9/

10.
Put Indian Sign on Swastika
Times Colonist, Victoria, British Columbia, Canada
Saturday, 6 April 1940 
https://www.newspapers.com/clip/36418535/indians_denounce_swastika_10/

For proper placement in history:
7 December 1941 - United States enters World War II.


In addition to the links above, one may find duplicates appearing in other papers.

This article is a near duplicate of "Indians Denounce Nazis, Forego Use of Swastika" from above.

Indians Renounce Swastika Symbol
Albuquerque Journal, Albuquerque, New Mexico
Monday 26 February 1940
https://www.newspapers.com/clip/13127176/albuquerque_journal/

Tuesday, November 19, 2019

SQL Server 2019 Scalar UDF inlining - OOM in some cases

*** Update 2020 April 7 ***

The issue described below was corrected in kb4536075, with the fix for the scalar UDF inlining OOMs first available in SQL Server 2019 CU2.

FIX: Scalar UDF Inlining issues in SQL Server 2019
https://support.microsoft.com/en-us/help/4538581/fix-scalar-udf-inlining-issues-in-sql-server-2019
SQL Server 2019 CU2
https://support.microsoft.com/en-us/help/4536075/cumulative-update-2-for-sql-server-2019

*** end update ***



Here's a little something I stumbled across.  A caution about scalar UDF inlining.

Well, ok, maybe Joe Obbish stumbled across it first :-)

Today's adventure is on my laptop.  Because, no lie, this ugly UDF combined with current UDF inlining memory consumption will take down your server no matter *how* big it is.

Here's my laptop SQL Server version and some important database details.


OK, let's create a scalar UDF with a few logic branches.  The function is nonsense, I'm sorry.  But if you try this... you can try making it as sensible as you'd like. :-)

In the function below there is one IF, 25 ELSE IFs, and 1 ELSE.


SET ANSI_NULLS ON
SET QUOTED_IDENTIFIER ON
GO
CREATE OR ALTER  FUNCTION dbo.test__inline_udf
(
  @in_param nvarchar(250)
)
RETURNS nvarchar(250)
AS
BEGIN
  DECLARE @retValue nvarchar(250) = @in_param
  IF @in_param = N'A'
  BEGIN
    IF @retValue NOT IN ( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)  
   BEGIN SET @retValue = N'1' END
  END
  ELSE IF @in_param = N'B'
  BEGIN
    IF @retValue NOT IN ( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17) 
    BEGIN SET @retValue = N'2' END
  END
  ELSE IF @in_param = N'C'
  BEGIN
    IF @retValue NOT IN ( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
       BEGIN SET @retValue = N'3' END
  END
  ELSE IF @in_param = N'D'
  BEGIN
    IF @retValue NOT IN ( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
    BEGIN SET @retValue = N'4' END
  END 
  ELSE IF @in_param = N'E'
  BEGIN
    IF @retValue NOT IN ( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
       BEGIN SET @retValue = N'5' END
  END
  ELSE IF @in_param = N'F'
  BEGIN
    IF @retValue NOT IN ( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
       BEGIN SET @retValue = N'6' END
  END
  ELSE IF @in_param = N'G'
  BEGIN
    IF @retValue NOT IN ( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
       BEGIN SET @retValue = N'7' END
  END
  ELSE IF @in_param = N'H'
  BEGIN
    IF @retValue NOT IN ( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
       BEGIN SET @retValue = N'8' END
  END
  ELSE IF @in_param = N'I'
  BEGIN
    IF @retValue NOT IN ( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
       BEGIN SET @retValue = N'9' END
  END
  ELSE IF @in_param = N'J'
  BEGIN
    IF @retValue NOT IN ( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
       BEGIN SET @retValue = N'10' END
  END
  ELSE IF @in_param = N'K'
  BEGIN
    IF @retValue NOT IN ( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
       BEGIN SET @retValue = N'11' END
  END
  ELSE IF @in_param = N'L'
  BEGIN
    IF @retValue NOT IN ( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
       BEGIN SET @retValue = N'12' END
  END
  ELSE IF @in_param = N'M'
  BEGIN
    IF @retValue NOT IN ( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
       BEGIN SET @retValue = N'13' END
  END
  ELSE IF @in_param = N'N'
  BEGIN
    IF @retValue NOT IN ( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
       BEGIN SET @retValue = N'14' END
  END
  ELSE IF @in_param = N'O'
  BEGIN
    IF @retValue NOT IN ( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
       BEGIN SET @retValue = N'15' END
  END
  ELSE IF @in_param = N'P'
  BEGIN
    IF @retValue NOT IN ( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
       BEGIN SET @retValue = N'16' END
  END
  ELSE IF @in_param = N'Q'
  BEGIN
    IF @retValue NOT IN ( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
       BEGIN SET @retValue = N'17' END
  END
  ELSE IF @in_param = N'R'
  BEGIN
    IF @retValue NOT IN ( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
       BEGIN SET @retValue = N'18' END
  END
  ELSE IF @in_param = N'S'
  BEGIN
    IF @retValue NOT IN ( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
       BEGIN SET @retValue = N'19' END
  END
  ELSE IF @in_param = N'T'
  BEGIN
    IF @retValue NOT IN ( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
       BEGIN SET @retValue = N'20' END
  END
  ELSE IF @in_param = N'U'
  BEGIN
    IF @retValue NOT IN ( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
       BEGIN SET @retValue = N'21' END
  END 
  ELSE IF @in_param = N'V'
  BEGIN
    IF @retValue NOT IN ( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
       BEGIN SET @retValue = N'22' END
  END
  ELSE IF @in_param = N'W'
  BEGIN
    IF @retValue NOT IN ( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
       BEGIN SET @retValue = N'23' END
  END
  ELSE IF @in_param = N'X'
  BEGIN
    IF @retValue NOT IN ( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
       BEGIN SET @retValue = N'24' END
  END
  ELSE IF @in_param = N'Y'
  BEGIN
    IF @retValue NOT IN ( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
       BEGIN SET @retValue = N'25' END
  END
  ELSE IF @in_param = N'Z'
  BEGIN
    IF @retValue NOT IN ( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
       BEGIN SET @retValue = N'26' END
  END
  ELSE
  BEGIN SET @retValue = N'00' END
  RETURN @retValue
END


On my laptop, the following result took from 8 to 10 seconds repeatedly.



I'll tell you what.  I ran this same function on a VM with 930 GB vRAM, with Max Server Memory set to 690GB.  It ran for over 15 minutes before crashing.  Not just crashing itself.  Crashing any other query on the instance that was trying to allocate memory (e.g. stealing against a query memory grant).  It had amassed over 500 GB of optimizer memory at that point, like this...


Wow.  Now you'll notice that the big ol' server is a slightly different version.  No matter.  As far as I know, this behavior will be found on every version of SQL Server 2019 released to date (today is 2019 November 19).

Once the instance reached the total limit for "stealable" memory, the query crashed.  Same thing if an estimated plan was requested - so it's in compilation rather than execution that the aggressive memory consumption occurs.  Once the OOM occurs, the large amount of optimizer memory is freed within a few seconds and the instance is back to normal for all other purposes.
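
If you want to watch it happen, here's a sketch to run from a second session while the compile is chewing through memory; the clerk name is as it appears in sys.dm_os_memory_clerks.

SELECT type, memory_node_id, pages_kb / 1024 AS pages_mb
FROM sys.dm_os_memory_clerks
WHERE type = N'MEMORYCLERK_SQLOPTIMIZER'
ORDER BY pages_kb DESC;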

Now, if I disable UDF inlining... the following result comes in well under 1 second.
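
For reference, a sketch of the documented ways to switch scalar UDF inlining off in SQL Server 2019; any one of them keeps the inlining attempt from happening.

-- database-wide
ALTER DATABASE SCOPED CONFIGURATION SET TSQL_SCALAR_UDF_INLINING = OFF;

-- per function: add WITH INLINE = OFF to the CREATE OR ALTER FUNCTION above

-- per query: append this hint to the calling statement
-- OPTION (USE HINT('DISABLE_TSQL_SCALAR_UDF_INLINING'))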


Here's the final thing I can say about this for now...
If you generate an estimated plan for a query that tries to inline that UDF, it'll also crash due to excessive optimizer memory*.

I'll update this blog post in the future when a fix is available.


* well, I speculate that there is some amount of memory which may be sufficient to allow this to complete without generating an OOM.  But once it's more than 500 GB does it really matter?


~~~~~

The scalar UDF used above is really, really bad :-)

So here's a nicer one.


SET ANSI_NULLS ON
SET QUOTED_IDENTIFIER ON
GO
CREATE OR ALTER  FUNCTION dbo.test__inline_udf_2
(
  @in_param INT
)
RETURNS INT
AS
BEGIN
  DECLARE @retValue INT = @in_param
  IF @in_param = 1
  BEGIN
    IF @retValue IN 
  ( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
   17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32)
  SET @retValue = 1
  END
  ELSE IF @in_param = 2
  BEGIN
    IF @retValue IN 
  ( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
   17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32)
   SET @retValue = 2
   END
   ELSE IF @in_param = 3
  BEGIN
    IF @retValue IN 
  ( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
   17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32)
   SET @retValue = 3
   END
   ELSE IF @in_param = 4
   BEGIN
     IF @retValue IN 
  ( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
   17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32)
   SET @retValue = 4
   END
   ELSE IF @in_param = 5
   BEGIN
     IF @retValue IN 
  ( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
   17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32)
   SET @retValue = 5
  END
  ELSE SET @retValue = 0
  RETURN @retValue
END

No more implicit conversions.  Took out some unnecessary BEGIN-END pairs.  It's only one IF, four ELSE IFs, and an ELSE.  Changed the NOT IN clauses to IN clauses.

And when I run it on my monster VM, it also generates an error.


That looks even more severe than the previous error.  But it's the same underlying condition: working on the inlining of the scalar UDF during plan compile kept gobbling optimizer memory until something gave way.

Msg 701, Level 17, State 123, Line 6
There is insufficient system memory in resource pool 'default' to run this query.
Location:
Expression: false
SPID: 61
Process ID: 4916
Msg 3624, Level 20, State 1, Line 1
A system assertion check has failed.
Msg 596, Level 21, State 1, Line 0
Cannot continue the execution because the session is in the kill state.
Msg 0, Level 20, State 0, Line 0
A severe error occurred on the current command.

Monday, September 30, 2019

Observing the [DBCC CHECKDB] parallel object check

This is a query I use to keep tabs on all of the workers when I run [dbcc checkdb] at high DOP :-)

Note: this query won't catch the initial portions of checkdb that always run at DOP 1 - the checkalloc, checkcatalog, etc.


;WITH owt AS
(SELECT session_id, exec_context_id, wait_type, 
        MAX(wait_duration_ms) wait_duration_ms
 FROM sys.dm_os_waiting_tasks owt 
 GROUP BY session_id, exec_context_id, wait_type)
SELECT capture_tm = getdate(), owt.wait_type, count(*) waiters, 
       MIN(owt.wait_duration_ms) min_wait_ms, 
       MAX(owt.wait_duration_ms) max_wait_ms
FROM sys.dm_exec_requests req
JOIN sys.dm_os_tasks ot ON req.session_id = ot.session_id
LEFT OUTER JOIN owt
     ON ot.session_id = owt.session_id AND ot.exec_context_id = owt.exec_context_id
WHERE req.command IN ('dbcc table check') 
GROUP BY owt.wait_type
ORDER BY waiters DESC;


Thursday, September 19, 2019

SQL Server Unequal NUMA CPU engagement

With physical servers using NUMA configuration - or vms using vNUMA - a number of factors can lead to uneven CPU utilization among the NUMA nodes.

Consider a 2x8 vm, with SQL Server maxdop set at 8. Assume a single maxdop 8 parallel query running on that system. The parallel workers will most likely be sitting on 8 user schedulers. If that parallel query has multiple zones, the parallel workers will still most likely be on 8 user schedulers, with as many parallel workers for the query stacked on each of those schedulers as there are zones in the plan.

All of the schedulers hosting parallel workers for that query may be co-resident with each other in vNUMA node 0. Or 1. Or they may be spread across them in some manner.

The "single query with maxdop less than scheduler count" scenario gives a good starting point for understanding how CPU engagement of the NUMA nodes may be different at any given time.

(Location of context ID 0 for the session adds another detail to this scenario. It may be co-resident on a scheduler with some number of parallel workers for the query, or it may not. For example, if all parallel workers are in vNUMA node 0 and context ID 0 is also in vNUMA node 0 in this example, context ID 0 is co-resident on a scheduler with at least one parallel worker for the query.)
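
Here's a sketch for checking where a given query's workers landed, by node and scheduler.  The session_id value is hypothetical - plug in the parallel query's own.

SELECT s.parent_node_id, t.scheduler_id, task_count = COUNT(*)
FROM sys.dm_os_tasks t
JOIN sys.dm_os_schedulers s ON t.scheduler_id = s.scheduler_id
WHERE t.session_id = 55  -- hypothetical session_id; substitute the parallel query's session
GROUP BY s.parent_node_id, t.scheduler_id
ORDER BY s.parent_node_id, t.scheduler_id;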

Imbalances can even arise if all queries are maxdop 1. SQL Server distributes incoming connections to an end-point among all nodes available to that end-point with a round-robin scheme. But! Due to use of ADO.NET connection pooling and a pre-existing pool, two successive connections from a client - in the same pool - may very well end up on the same node. The fairly fast ramp-up of pools to their maximum size (by default 100) and fairly slow ramp-down (documentation says an idle pool connection will be disposed after 4 to 8 minutes of idle time, down to minimum pool size) can lead to some unexpected distributions of connections.
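
A one-off version of what the capture procedure at the end of this post collects - how the current connections are spread across the nodes:

SELECT node_affinity, conn_count = COUNT(*)
FROM sys.dm_exec_connections
GROUP BY node_affinity
ORDER BY node_affinity;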

Now, it's even possible that an equal number of active workers are on each scheduler of each node, but the vNUMA nodes *still* show unequal CPU engagement. I'll just mention one query-based possibility and one system-state-based possibility.

What if the queries on node 1 are optimized CCI-based queries, while through some stroke of luck open to the author of hypothetical situations the queries on node 0 are heavily read-dependent rowstore queries?

The distribution of pageio_latch waits would lean toward node 0 due to the operation pattern there (especially if readahead for some reason or other isn't engaged, or is still so slow it produces waits). And those waits could suppress CPU engagement on node 0, while the same level of effect would not impede engagement on node 1.

Now let's talk how system state could result in uneven CPU engagement across the vNUMA nodes, even if work is evenly distributed.
Various memory states can result in that pattern.
What if another application is running on the same VM? If its memory primarily comes from NUMA node 0, its memory management can come into conflict with SQL Server, which will try to balance its memory utilization across the NUMA nodes.

Perfmon counters [\NUMA Node Memory(*)\Free & Zero Page List MBytes] can be good to peruse if this is suspected. If only one of 2 nodes is flirting with bottoming out Free & Zero Page List memory, it can disproportionately suffer from memory stalls and memory-related SQL Server waits, as well as potentially suffering from content being sent to and retrieved from pagefile.sys.

OK, enough speculation from me. :-)

Here's a stored procedure that, if run at a fairly short interval like every 10 seconds, can give insight into uneven CPU utilization on a NUMA server based on information from within SQL Server.


CREATE PROCEDURE dbo.SASQ_GRAB
AS
IF object_id ('dbo.SASQ_SCHEDULER_TASKS') IS NULL
BEGIN
    CREATE TABLE dbo.SASQ_SCHEDULER_TASKS (
     capture_time DATETIME NOT NULL,
     scheduler_id INT NOT NULL,
     session_id SMALLINT NULL,
     wait_type NVARCHAR(60) NULL,
     task_count SMALLINT NULL
    );

    CREATE CLUSTERED INDEX CI ON dbo.SASQ_SCHEDULER_TASKS (capture_time) 
    WITH (data_compression = row);
END

IF object_id ('dbo.SASQ_NODE_CONNS') IS NULL
BEGIN
    CREATE TABLE dbo.SASQ_NODE_CONNS (
     capture_time DATETIME NOT NULL,
     node_affinity TINYINT NOT NULL,
     client_net_address VARCHAR(50) NULL,
     conn_count SMALLINT NULL
    );

    CREATE CLUSTERED INDEX CI ON dbo.SASQ_NODE_CONNS (capture_time)
    WITH (data_compression = row);
END

IF object_id ('dbo.SASQ_WAIT_STATS') IS NULL
BEGIN
    CREATE TABLE dbo.SASQ_WAIT_STATS (
     capture_time        DATETIME NOT NULL,
     wait_type           NVARCHAR(60) NOT NULL,
     waiting_tasks_count BIGINT NOT NULL,
     wait_time_ms        BIGINT NOT NULL,
     max_wait_time_ms    BIGINT NOT NULL,
     signal_wait_time_ms BIGINT NOT NULL
    );

    CREATE CLUSTERED INDEX CI ON dbo.SASQ_WAIT_STATS (capture_time)
    WITH (data_compression=row);
END

DECLARE @ct DATETIME;
SET @ct = GETDATE();

INSERT INTO dbo.SASQ_SCHEDULER_TASKS WITH (TABLOCK)
SELECT @ct, ot.scheduler_id, ot.session_id, owt.wait_type, task_count = count(*)
FROM sys.dm_os_tasks ot with (nolock)
LEFT OUTER JOIN sys.dm_os_waiting_tasks owt with (nolock) ON ot.task_address = owt.waiting_task_address
WHERE scheduler_ID < 1048576
GROUP BY ot.scheduler_id, ot.session_id, owt.wait_type
ORDER BY ot.scheduler_id ASC
OPTION (MAXDOP 1);

INSERT INTO dbo.SASQ_NODE_CONNS with (tablock)
SELECT @ct, node_affinity, client_net_address, conn_count = count(*) 
FROM SYS.dm_exec_connections with (nolock)
GROUP BY node_affinity, client_net_address
OPTION (MAXDOP 1);


INSERT INTO dbo.SASQ_WAIT_STATS with (tablock)
SELECT capture_time = GETDATE(), wait_type, waiting_tasks_count, wait_time_ms, max_wait_time_ms, signal_wait_time_ms
FROM sys.dm_os_wait_stats ws WITH (NOLOCK)
WHERE waiting_tasks_count <> 0
OPTION (MAXDOP 1);

/* 20190919
   grab some info from DMOs in iterations of eg 10s to understand
   discrepancy of cpu engagement between NUMA nodes
*/