Hello, it’s been a while since I last blogged: I’ve had a busy few years, took a sabbatical, and there was nothing to blog about without breaching an NDA. Anyway, I’m back with a new issue which may help those of you who are virtualizing latency-sensitive workloads, a niche world, I know…
Back to the issue at hand. One of the things a VM latency sensitivity setting of High does is grant each vCPU exclusive affinity (ExAff) to a physical core. In the past this only worked when the physical cores were actually available to be fully reserved, and it failed silently when they weren’t. KB339990 mentions this was fixed by explicitly requiring a 100% CPU reservation for VMs with LS=HIGH since VM HW version 14, I believe. However, this doesn’t always work, and I’ll try to explain when.
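For reference, this is roughly how LS=HIGH plus a full CPU reservation is applied through the vSphere API. A hedged pyVmomi sketch only: it assumes you already hold a vim.VirtualMachine object, and make_latency_sensitive and reservation_mhz are names I made up for illustration.

from pyVmomi import vim

# Sketch: reconfigure an existing VM (a vim.VirtualMachine object you
# have already looked up) for LS=HIGH with a full CPU reservation.
# 'reservation_mhz' is a placeholder: a 100% reservation means
# (number of vCPUs) x (physical core frequency in MHz).
def make_latency_sensitive(vm, reservation_mhz):
    spec = vim.vm.ConfigSpec()
    spec.latencySensitivity = vim.LatencySensitivity(level='high')
    spec.cpuAllocation = vim.ResourceAllocationInfo(
        reservation=reservation_mhz)
    return vm.ReconfigVM_Task(spec)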
It is important to note that LS=HIGH attempts to collocate all of a VM’s vCPUs and memory on a single NUMA node.
And this is the problem: the CPU reservation check works at a global level in ESXi and does not take an individual NUMA node’s capacity into account for VMs with NUMA affinity.
Therefore you can end up in a situation where a VM with LS=HIGH and a 100% CPU reservation is scheduled over-provisioned, meaning several of its vCPUs share a single pCPU. That’s right! The scheduler won’t create a wide VM (a VM stretched across multiple NUMA nodes) with full cores even if they are available, but rather over-provisions the VM. Both scenarios are bad, but over-provisioning is worse in my opinion.
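To make the mismatch concrete, here is a minimal Python sketch of the two checks. This is purely a toy model of the behavior I observe, not ESXi code, and both function names are mine:

def passes_global_admission(total_pcpus, reserved_pcpus, vcpus):
    # What ESXi appears to enforce for LS=HIGH: enough unreserved
    # CPU on the host as a whole.
    return total_pcpus - reserved_pcpus >= vcpus

def fits_on_single_numa_node(free_cores_per_node, vcpus):
    # What ExAff placement actually needs: one NUMA node with enough
    # free cores to grant every vCPU its own physical core.
    return any(free >= vcpus for free in free_cores_per_node)

# A VM can pass the first check and still fail the second, and that is
# exactly the case where it gets over-provisioned instead of rejected.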
Ideally, I would prefer such a VM to fail to power on, or for the user to get a choice of what to do in such a scenario: for example, a warning indicating that there are no suitable resources left on the host, and DRS being able to schedule the VM on another host within the cluster. The current situation forces customers to manage LS=HIGH workloads manually, or to use complex scripts for placement (see the sketch below), as DRS is not good at handling this on its own.
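As an example of the kind of scripting this pushes onto the user, below is a rough pyVmomi sketch that compares a VM’s vCPU count against the NUMA node sizes of its host before power-on. The vCenter address, credentials and find_vm helper are placeholders of mine, and note it only checks raw node size: which cores are already exclusively reserved is not, as far as I know, exposed through this API, so treat it as a starting point, not a complete placement check.

import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

def find_vm(content, name):
    # Walk the inventory for a VM by name.
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True)
    try:
        return next(v for v in view.view if v.name == name)
    finally:
        view.DestroyView()

si = SmartConnect(host='vcenter.example.com',              # placeholder
                  user='administrator@vsphere.local',      # placeholder
                  pwd='***',
                  sslContext=ssl._create_unverified_context())  # lab only
try:
    content = si.RetrieveContent()
    vm = find_vm(content, 'brick3')
    host = vm.runtime.host
    numa = host.hardware.numaInfo   # per-node topology, incl. cpuID lists
    vcpus = vm.config.hardware.numCPU
    fits = any(len(node.cpuID) >= vcpus for node in (numa.numaNode or []))
    print(f'{vm.name}: {vcpus} vCPUs, single-NUMA-node fit possible: {fits}')
finally:
    Disconnect(si)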
Here is my proof:
Test setup:
- ESXi 8.0U3 (also tested on ESXi 7.0U3)
- 2x CPUs, 16 cores each
- 2 NUMA nodes
- 3x VMs, 9 vCPUs each, LS=HIGH and 100% CPU reservation
Below you can see the output of vcpu_affinity_info.sh, a handy script by Valentin Bondzio:
[root@localhost:~] ./vcpu_affinity_info.sh
CID=2100566 GID=21068 LWID=2100656 Name=brick1
Group CPU Affinity: guest worlds:0-31 non-guest worlds:0-31
Latency Sensitivity: -1
NUMA client 0: affinity: 0x00000003 home: 0x00000001
vcpuId   vcpu#  pcpu#  affinityMode  softAffinity  Affinity  ExAff
2100656  0      16     2 -> sched    16            0-31      yes
2100658  1      26     2 -> sched    26            0-31      yes
2100659  2      22     2 -> sched    22            0-31      yes
2100660  3      18     2 -> sched    18            0-31      yes
2100661  4      23     2 -> sched    23            0-31      yes
2100662  5      21     2 -> sched    21            0-31      yes
2100663  6      19     2 -> sched    19            0-31      yes
2100664  7      20     2 -> sched    20            0-31      yes
2100665  8      30     2 -> sched    30            0-31      yes

CID=2100679 GID=21728 LWID=2100768 Name=brick2
Group CPU Affinity: guest worlds:0-31 non-guest worlds:0-31
Latency Sensitivity: -1
NUMA client 0: affinity: 0x00000003 home: 0x00000000
vcpuId   vcpu#  pcpu#  affinityMode  softAffinity  Affinity  ExAff
2100768  0      4      2 -> sched    4             0-31      yes
2100770  1      1      2 -> sched    1             0-31      yes
2100771  2      13     2 -> sched    13            0-31      yes
2100772  3      3      2 -> sched    3             0-31      yes
2100773  4      9      2 -> sched    9             0-31      yes
2100774  5      7      2 -> sched    7             0-31      yes
2100775  6      8      2 -> sched    8             0-31      yes
2100776  7      11     2 -> sched    11            0-31      yes
2100777  8      6      2 -> sched    6             0-31      yes

CID=2101305 GID=27125 LWID=2101394 Name=brick3
Group CPU Affinity: guest worlds:0-31 non-guest worlds:0-31
Latency Sensitivity: -1
NUMA client 0: affinity: 0x00000003 home: 0x00000000
vcpuId   vcpu#  pcpu#  affinityMode  softAffinity  Affinity  ExAff
2101394  0      2      2 -> sched    0-15          0-31      no
2101396  1      12     2 -> sched    0-15          0-31      no
2101397  2      10     2 -> sched    0-15          0-31      no
2101398  3      14     2 -> sched    0-15          0-31      no
2101399  4      0      2 -> sched    0-15          0-31      no
2101400  5      10     2 -> sched    0-15          0-31      no
2101401  6      14     2 -> sched    0-15          0-31      no
2101402  7      0      2 -> sched    0-15          0-31      no
2101403  8      15     2 -> sched    0-15          0-31      no
Here you can see brick1 and brick2 are both scheduled with ExAff as expected, but brick3 has no ExAff. Also, vCPUs 2 and 5 are scheduled on pCPU 10, vCPUs 3 and 6 on pCPU 14, and vCPUs 4 and 7 on pCPU 0, so the VM is over-provisioned. The reason is obvious: there is enough CPU left to reserve globally on the host, but not enough on a single NUMA node. The math is quite simple: 32 - 9 - 9 = 14 pCPUs are still unreserved on the host, more than the 9 that brick3 needs, but each NUMA node only has 16 - 9 = 7 free cores, fewer than 9.
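Plugging the test numbers into the toy checks from the earlier sketch (repeated inline here so it stands on its own):

# Numbers from the test setup above; brick1 and brick2 already hold
# 9 exclusive cores each, one VM per NUMA node.
total_pcpus, vcpus = 32, 9        # host size, brick3's vCPU count
reserved = 9 + 9                  # brick1 + brick2
free_per_node = [16 - 9, 16 - 9]  # 7 cores free on each node

print(total_pcpus - reserved >= vcpus)         # True: global check passes (14 >= 9)
print(any(f >= vcpus for f in free_per_node))  # False: no node has 9 free cores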
Now, to be fair, a small warning is issued in the host’s event log:
Unable to apply latency-sensitivity setting to virtual machine brick3. No valid placement on the host.
And also in vmkernel.log:
2024-06-27T09:12:02.773Z Wa(180) vmkwarning: cpu1:2101317)WARNING: CpuSched: 1400: Unable to apply latency-sensitivity setting to virtual machine brick3. No valid placement on the host.
But DRS has no issue powering on such a VM either, which complicates the management of these VMs.
I have an SR open for this, as I don’t think it should work the way it does right now; I’ll try to keep you posted.