Though perfmon "\SQLServer:Resource Pool Stats(default)\Avg Disk Read IO (ms)" is a very convenient measure, I don't have much confidence in it right now.
Reviewing recent observations, this graph stood out to me.
I am typically very careful with database layout, to optimize for the high IOPs, high outstanding IO workloads I test. A graph with no correlation between read IOPs and read service time is intriguing.
And I admit I spent too much time investigating exotic theories before I returned to a fundamental question: how confident am I in what is measured?
In this VM (the only VM active on the physical server), SQL Server is the sole significant resource consumer. In this workload, the default resource pool was the sole significant busy pool.
So I can compare measurements of the SQL Server Resource Governor pool to server-wide measurements.
That looks pretty tight. I expect server-wide read IOPs to sometimes exceed SQL Server read IOPs due to perfmon and other small consumers on the server. SQL Server read IOPs are sometimes reported slightly higher than server-wide read IOPs. I'm not too troubled by the level of difference I've seen, though.
Different story when considering read service times.
Blimey! That doesn't square at all. This workload was active over 8 LUNs for the active user database primary filegroup data files with a separate txlog LUN, and 2 LUNs for tempdb database data files with a separate txlog LUN. Maybe if I look at the maximum read service time across all of those LUNs I'll make better sense of the situation?
Nope.
Younger me would probably take some time to dig into this. But I'm old, and jaded. Just gonna report to Microsoft the same thing I'll tell you:
"\SQLServer:Resource Pool Stats(default)\Avg Disk Read IO (ms)" can't be trusted.
I'll update this post someday if I see that change.
Ciao!
No comments:
Post a Comment