So when it comes to tempdb best practices, I like to remind folks that not all use of tempdb looks the same from the system side.
On the systems I work with, sort/hash spill is the main component of tempdb use. On many OLTP systems, rapid create/drop of temp tables is a much more prominent factor in tempdb considerations. On some systems, row versioning or perhaps use of large local or global temp tables is a significant consideration.
Two concerns arise from the various uses of tempdb:
- do they all stress allocation pages in the same way? Tempdb allocation page contention and resulting pagelatch waits are the classic drivers of increasing the tempdb data file count for a given workload.
- do they all utilize the storage subsystem in the same way? This is another, separate consideration in tempdb design which may influence decisions about which RAID level or disk group to put tempdb on, whether to use flash or HDD, how many Windows volumes to use, etc.
Today I want to turn to the question of how some tempdb activity engages with storage.
Sort and hash spill is my primary concern, but today I'm going to weasel out by using CREATE INDEX with the sort_in_tempdb option as a proxy for sort spill. At a later time (once my other tests behave a bit more) I'll come back to this and compare sort and hash spills to sort_in_tempdb activity.
The system I'm using is an 8 vCPU VM with storage from an all-flash array, running SQL Server 2016 SP1.
Here's the setup for my test.
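Something along these lines; the table name, row count, and padding width below are stand-ins rather than the exact script, but the shape is right:

```sql
-- Sketch of the setup; names and sizes are illustrative stand-ins.
-- CHAR(7200) plus the BIGINT key leaves room for only one row per 8KB page,
-- so an uncompressed heap of N rows is a predictable ~N * 8KB.
DROP TABLE IF EXISTS dbo.stress_table;    -- SQL Server 2016 syntax
CREATE TABLE dbo.stress_table
(   num     BIGINT     NOT NULL,
    string  CHAR(7200) NOT NULL DEFAULT 'x'
);

;WITH nums AS
(   SELECT TOP (1000000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS num
    FROM sys.all_columns AS c1
    CROSS JOIN sys.all_columns AS c2
)
INSERT INTO dbo.stress_table (num)
SELECT num FROM nums;
```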
Crude, but effective. You can probably tell that I was initially working with a table 3 times as large and scaled it back :-) The large string in each row forces 1 row per page, giving a fairly predictable table size when the table is not compressed.
For the first test scenario, let's create a temp table and insert all rows into it.
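A sketch, reusing the stand-in names from the setup above:

```sql
-- Scenario 1: explicit temp table, then INSERT... SELECT of every row.
CREATE TABLE #temp1
(   num     BIGINT     NOT NULL,
    string  CHAR(7200) NOT NULL
);

INSERT INTO #temp1 (num, string)
SELECT num, string
FROM dbo.stress_table;

DROP TABLE #temp1;
```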
OK, now let's use a "SELECT... INTO" query.
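Again sketched with the stand-in names:

```sql
-- Scenario 2: SELECT... INTO creates and loads the temp table
-- in a single statement.
SELECT num, string
INTO #temp2
FROM dbo.stress_table;

DROP TABLE #temp2;
```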
Finally, let's use "CREATE INDEX" with the sort_in_tempdb option.
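Sketched:

```sql
-- Scenario 3: the sort for the index build does its work in tempdb
-- rather than in the user database.
CREATE INDEX ix_stress_table__num
ON dbo.stress_table (num)
WITH (SORT_IN_TEMPDB = ON);
```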
Now... let's see how that all turned out. 1-second perfmon captures below.
Interesting that the "select... into" took a little longer, isn't it :-) But the two temp table operations were close to each other in write bytes/sec to tempdb. Look how aggressive that "create index" was, though! It maxed out the adapter bandwidth.
Let's look at writes/sec rather than write bytes/sec.
Whoa. The "select... into" in the middle issued wwwaaaayyyy more writes, and way more writes/sec to tempdb than either of the other operations. But... each operation wrote the same amount to tempdb. So the "select... into" was using some pretty small writes in comparison.
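(If you want to double-check that kind of observation without perfmon, one way is to snapshot sys.dm_io_virtual_file_stats for tempdb before and after each scenario and diff the numbers; a sketch, assuming nothing else is hitting tempdb at the time:)

```sql
-- Tempdb write activity since instance startup; run before and after
-- a scenario and diff num_of_writes / num_of_bytes_written.
SELECT file_id,
       num_of_writes,
       num_of_bytes_written,
       io_stall_write_ms
FROM sys.dm_io_virtual_file_stats(DB_ID('tempdb'), NULL);
```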
Based on the number of writes, the fact that the "select... into" showed higher disk queue length than the first temp table insert isn't too surprising. But... maybe it's a little surprising that the "create index" has such a high queue length, considering its much lower number of writes/second?
But if we look at sec/write... aha! The table insert and "select... into" operations kept write latency well below 64 ms. It's a little surprising to me that the "insert into" had a higher peak write latency than the "select... into". That may certainly be an oddity of that particular test run. But I don't have a long enough attention span to repeat the test 10 or more times to even it out :-) Nah, the real surprise is the write latency over 300 ms in the "create index". On flash storage, too.
So... different uses of tempdb stress storage in different ways. If I'm correct that "create index" is a good proxy for hash/sort spills, then tempdb spills require special consideration: the storage has to be able to destage very aggressive write bytes/sec.
And... a theme I'll come back to in the future... relevant to the high write latency experienced by the "create index"...
strange things happen to service time at saturation.