Wednesday, March 1, 2017

Plotting the CPU in a VM shouldn't look like this...

***** Author's note *****
My half-baked drafts often contain more questions than answers 😀😀😀
When I get more information, I may be able to come back and update.  If I actually answer the questions, hopefully I'll promote it to a fully-baked blog post.  No promises, though.

If this VMware half-baked draft intrigues you, this related half-baked draft might also be of interest.
Windows Guest: VMware Perfmon metrics intermittently missing?

I've become accustomed to perfmon graphs that look like this from within a VMware VM.

It's a 4-vcpu system.  When I first worked with VMs, I expected that metric to align with 100 * vcpu count - so a max value of 400.  But as you can see below, it approaches 500.  Maybe that's because the non-vcpu worlds consumed physical CPU time on behalf of the VM on pcpus other than the 4 pcpus serving the vcpus.  (So the vcpus themselves could consume up to 400%, with the non-vcpu worlds adding some beyond that.)  It might also be due to calculations based on the rated frequency of the pcpus, with SpeedStep kicking in.  Could even be both?  Maybe by the time I convert this from a half-baked draft to a full-fledged blog post I'll be able to propose a way to know what leads to the overage.
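The frequency hypothesis is easy to put numbers on.  As a sketch - the MHz figures below are made up for illustration, not measured from this host - if the counter is normalized against the pcpu's rated frequency while Turbo Boost lets the core run faster, a pegged vcpu reports more than 100%:

```python
# Hypothetical figures: 4 vcpus on pcpus rated at 3000 MHz that can
# turbo up to 3500 MHz.  If guest-reported utilization is normalized
# against rated frequency, a fully busy vcpu reports > 100%.
rated_mhz = 3000
turbo_mhz = 3500
vcpus = 4

per_vcpu_max = 100 * turbo_mhz / rated_mhz   # ~116.7% per vcpu
total_max = vcpus * per_vcpu_max             # ~466.7% for the whole VM

print(round(per_vcpu_max, 1), round(total_max, 1))
```

With those made-up numbers the VM tops out near 467% - in the same neighborhood as the "approaches 500" graph, which is why I can't rule this explanation out yet.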

Here's the reason for this half-baked blog post: the graph below took me by surprise.

Something weird is going on - probably outside this VM - and it's severely affecting the reported relationship between guest vcpu utilization in 'Processor Info' and 'VM Processor'.

It's important to understand what 'VM Processor(*)\%Processor Time' means in the context of a VM.  It measures the time there are runnable instructions already on the vcpu.  But the vcpu might not be bound to a pcpu beneath.  For example - what if the pcpus are oversubscribed, and two busy VMs are fighting over the same pcpus?  The vcpus in the guests could report 100% busy while only being bound to pcpus 50% of the time.  In that scenario, high %RDY time would be expected at the ESXi host level.  Could be a %CSTP (co-stop) situation, too - a scheduling requirement of a multi-vcpu VM.  Could be lots of halt & wakeup cycles, resulting in higher reported vcpu utilization in the guest than pcpu utilization on the host.  Lots of migrations of vcpus across NUMA nodes could also lead to higher vcpu utilization reported in the guest than pcpu utilization at the host.
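To put the oversubscription scenario in numbers: vCenter's real-time charts report CPU ready as a millisecond summation per 20-second sample interval, and the usual conversion to %RDY (per vcpu) is ready_ms / (interval_s * 1000) * 100.  A minimal sketch:

```python
def ready_percent(ready_ms: float, interval_s: int = 20) -> float:
    """Convert a CPU-ready summation (ms per sample interval) to %RDY.

    vCenter real-time charts sum ready time over 20-second intervals,
    so 2,000 ms of ready time in one interval is 10 %RDY for that vcpu.
    """
    return ready_ms / (interval_s * 1000.0) * 100.0

# The 50/50 contention scenario above: a vcpu that spends half of each
# interval waiting for a pcpu accumulates 10,000 ms ready per 20 s.
print(ready_percent(10_000))  # 50.0
```

So a guest reporting a vcpu 100% busy while the host shows ~50 %RDY for that vcpu would fit the two-VMs-fighting-over-the-same-pcpus story exactly.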

Could also be a memory mismatch.  If the guest is performing operations on vRAM that is not actually backed by physical RAM on the ESXi host, the operations may be counted as 100% CPU time within the guest even though the host registers a little bit of CPU time and a lot of wait time for paging/swap space traffic.  In a similar fashion, vMotion means a lot of memory traffic becomes disk traffic, and vcpu utilization in the guest can be exaggerated by sluggish interaction with vRAM.  But two hours of wall clock time is an awful lot of vMotion :-)

But I can't shake the feeling that the nearly perfect 1:00 am to 3:00 am window means something very important here.  Maybe VM backups in the shared datastore that holds this VM's vmdk?

I'll be back hopefully to update this one with more detail in the future...

Current suspects include:
· guest vRAM not backed by host physical RAM
· oversubscribed pcpus & excessive %RDY
· excessive %CSTP (co-stop)
· excessive vcpu migrations
· excessive halt/wakeup cycles
· VM backups in a shared datastore
· excessive vMotion activity
· patch apply to the ESXi host?
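Most of these suspects leave fingerprints in esxtop batch-mode output (`esxtop -b` writes CSV).  Here's a sketch of how I'd scan that 1:00-3:00 am window for scheduling-contention spikes - note the column names below are simplified stand-ins, not the real (much longer) esxtop counter headers, and the sample rows are invented:

```python
import csv
import io

# Illustrative excerpt of esxtop batch-mode output.  Real esxtop CSV
# headers embed host/world names; these columns are simplified stand-ins.
sample = """\
time,%USED,%RDY,%CSTP
01:00:00,95.0,2.1,0.3
01:20:00,96.2,31.5,12.8
01:40:00,94.8,28.9,10.1
03:00:00,40.0,1.2,0.1
"""

def suspicious_intervals(text, rdy_limit=10.0, cstp_limit=3.0):
    """Return timestamps where scheduling-contention counters spike."""
    hits = []
    for row in csv.DictReader(io.StringIO(text)):
        if float(row["%RDY"]) > rdy_limit or float(row["%CSTP"]) > cstp_limit:
            hits.append(row["time"])
    return hits

print(suspicious_intervals(sample))  # ['01:20:00', '01:40:00']
```

If the spikes line up with the backup schedule or a vMotion event log entry, that would narrow the suspect list considerably.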
