Comments on sql.sasquatch: "Multiple tempdb data files for performance - What could go wrong?"

SQL_Sasquatch, 2017-08-25:

What do you mean by "high"? Maybe high average sec/write? The next question would be: what do you mean by healthy VM and SAN metrics?

IME most SANs provide metrics in a pretty wide observation window - maybe every 5 minutes. But in SQL Server we typically pay attention to much smaller windows - I usually use a 30-second interval for perfmon, for example. Numbers that look bad in 30-second intervals can even out pretty well in 5-minute intervals.

Also, writes might not look so bad when averaged in with reads - VMware stats (like the stats in most flavors of UNIX except AIX) often lump reads and writes together. Really bad write latency can be a number that's just "meh" for read latency, so putting read and write numbers together can hide a problem.

Tempdb disk IO patterns - especially spills - are very different from the IO patterns to persistent database files. For one thing, spills are nearly always 50% read/50% write; many "write heavy" workloads still have a 75% read/25% write pattern to the persistent database files. Another differentiator is that spills to tempdb are much more aggressive - much higher queue lengths - than other writes.

If there are multiple tempdb files on a single Windows volume which is a single Windows physical disk, it's not too hard for the disk queue to be overwhelmed during a spill. For a pvscsi vHBA, the default queue depth of the attached physicalDisks is 64. If "Current Disk Queue Length" in perfmon shows a number higher than 64, that's an indication the wait queue at that level is in use. It is possible to increase the queue depth of the individual physicalDisks and of the pvscsi vHBA adapter, but that helps most if the LUN queue depth of the physical LUNs in the ESXi host can handle it. With RDM disks (rarer all the time), host LUNs map one-to-one to guest physicalDisks. Otherwise, multiple guest physicalDisks could have their VMDKs in a single VMFS filesystem in a single datastore - and maybe that datastore is providing VMDKs to multiple VMs. In cases like that, the LUN queue depth and adapter queue depth in the VMware host can be overwhelmed.

If using VMDKs, it can also be important to check up on SIOC (which kicks in to add additional throttling after average latency exceeds a threshold) and adaptive queue throttling (which kicks in when the array signals back that its queues are full).

Finally, full or nearly full write cache in the array is still a common performance challenge for some database workloads, and tempdb spills stress write cache.

Eliminating and taming spills can give storage a much-needed break. But I also encourage reviewing system configuration so that some amount of aggressive tempdb use can be absorbed.
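[Editor's note: on the point above about reads masking write latency, here is a minimal T-SQL sketch (not from the comment) that splits average read and write latency per tempdb file using sys.dm_io_virtual_file_stats. The counters are cumulative since instance startup, so snapshot twice and diff to get an interval view comparable to a short perfmon window.]

```sql
-- Per-file average read vs. write latency for tempdb.
-- Values are cumulative since instance start; diff two snapshots
-- for an interval view.
SELECT DB_NAME(vfs.database_id)       AS database_name,
       mf.name                        AS file_name,
       vfs.num_of_reads,
       vfs.num_of_writes,
       CASE WHEN vfs.num_of_reads = 0 THEN 0
            ELSE 1.0 * vfs.io_stall_read_ms  / vfs.num_of_reads  END AS avg_read_ms,
       CASE WHEN vfs.num_of_writes = 0 THEN 0
            ELSE 1.0 * vfs.io_stall_write_ms / vfs.num_of_writes END AS avg_write_ms
FROM sys.dm_io_virtual_file_stats(DB_ID('tempdb'), NULL) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id
 AND mf.file_id     = vfs.file_id;
```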
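[Editor's note: and since the question below asks how to confirm a spill, here is a hedged sketch of an Extended Events session on the sort_warning and hash_warning events (available in SQL Server 2012 and later; verify the event names against sys.dm_xe_objects on your build before relying on this).]

```sql
-- Sketch: capture sort and hash spill warnings with the offending statement.
CREATE EVENT SESSION tempdb_spills ON SERVER
ADD EVENT sqlserver.sort_warning
    (ACTION (sqlserver.session_id, sqlserver.sql_text)),
ADD EVENT sqlserver.hash_warning
    (ACTION (sqlserver.session_id, sqlserver.sql_text))
ADD TARGET package0.event_file (SET filename = N'tempdb_spills');

ALTER EVENT SESSION tempdb_spills ON SERVER STATE = START;
```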
Dangalf, 2017-08-18:

I am trying to understand high tempdb writes. If the SAN/VMware metrics look healthy, does this mean that there are too many IO requests for Windows to handle? And do we need to look for a spill to resolve it? Thanks for your articles!

SQL_Sasquatch, 2013-12-12:

I'll post a followup in the future. Increasing the maximum transfer size had the intended effect of lowering IOPs and current disk queue length across the board. But along the way we've realized there is a yet-undiagnosed issue limiting overall throughput to the server as well. Even though the full bandwidth at the host HBA and at the array front-end adapter ports should allow for 1.6 GB/second, the server never gets above 800 MB/second - even with CPU to spare, long queue lengths (although shorter than before), and read/write latency still well below the targets. (A SQL-side way to watch that throughput ceiling is sketched after the last comment below.)

So we opted not to change the tempdb configuration yet - once the bandwidth issue is addressed, we can re-baseline. Comparing that baseline against new perfmon data after the tempdb config change will then show the benefit of the change. If we changed the tempdb config while bandwidth was constrained, part of the benefit would be obscured: removing the bandwidth bottleneck later would produce a gain from the combined effect of increased bandwidth and the more favorable tempdb config, and it would be very hard to attribute the benefit to the individual changes.

Alex Friedman, 2013-12-11:

Very interesting. How was the issue resolved? Did the recommendations help?
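[Editor's note: re the 800 MB/second ceiling in the 2013-12-12 comment above, here is a minimal sketch (assuming a 30-second window is representative of the workload) that cross-checks perfmon's disk-throughput counters from the SQL side by diffing sys.dm_io_virtual_file_stats across a delay.]

```sql
-- Instance-wide read/write throughput (MB/s) over a 30-second window.
DECLARE @reads_before bigint, @writes_before bigint;

SELECT @reads_before  = SUM(num_of_bytes_read),
       @writes_before = SUM(num_of_bytes_written)
FROM sys.dm_io_virtual_file_stats(NULL, NULL);

WAITFOR DELAY '00:00:30';

SELECT (SUM(num_of_bytes_read)    - @reads_before)  / 30.0 / 1048576 AS read_mb_per_sec,
       (SUM(num_of_bytes_written) - @writes_before) / 30.0 / 1048576 AS write_mb_per_sec
FROM sys.dm_io_virtual_file_stats(NULL, NULL);
```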