IBM FlashSystem V9000 and VMware vSphere ESXi Guidelines

IBM FlashSystem V9000 General design guidelines for performance

Use one mdisk group per flash storage enclosure
For optimum performance, use 4 (redundant) paths to each LUN
Use one host object per host defined on the storage system. Use more only if you need to reduce the number of paths, for example when you have more than 2 HBA ports in your server
For the best Real-Time Compression (RTC) performance, use at least 8 compressed volumes (LUNs) per V9000. Regardless of what salespeople tell you, creating one big volume is not a good thing from a performance point of view (not to mention from a VMware point of view). There are 8 threads dedicated to RTC, and one volume can be handled by only 1 thread.
Use Round-Robin as the multipathing policy
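
To illustrate the compressed-volume guideline above, here is a minimal sketch that splits one capacity figure into 8 equally sized compressed volumes and prints the matching mkvdisk commands for the V9000 CLI. Nothing is executed. The pool name, I/O group, volume prefix and the 2% real size are hypothetical placeholders, so adjust them to your system before using any of the output.

```shell
#!/bin/sh
# Sketch: plan several compressed volumes instead of one big one, so that
# each of the 8 RTC threads has a volume to work on.
# Pool name, I/O group, prefix and sizes are hypothetical placeholders.

# Print one mkvdisk command per volume (nothing is executed).
# usage: plan_rtc_volumes TOTAL_GB [COUNT] [POOL] [PREFIX]
plan_rtc_volumes() {
    total_gb=$1
    count=${2:-8}
    pool=${3:-mdiskgrp0}
    prefix=${4:-esx_rtc}
    # Integer division; any remainder is simply left unallocated.
    size_gb=$((total_gb / count))
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "mkvdisk -mdiskgrp $pool -iogrp io_grp0 -size $size_gb -unit gb -rsize 2% -autoexpand -compressed -name ${prefix}_$i"
        i=$((i + 1))
    done
}
```

For example, plan_rtc_volumes 16384 prints eight commands for 2048 GB volumes, which you can review and then paste into an SSH session on the V9000.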

I definitely recommend checking out our paper once it is published if you want to know more.

VMware specific

ESXi obviously comes with some preconfigured defaults that work great most of the time in standard environments. I’m not a huge fan of changing defaults, but sometimes it is needed if you want to get the best out of it.

Consistent LUN numbering

Although (I think since ESXi 5.0) a LUN shared across the whole ESXi cluster is no longer required to have the same LUN number on every host, it is still recommended to keep the numbering consistent. It is required if you are using RDM and MSCS clustering.

Round-Robin

If, for whatever reason, you are using an ESXi version prior to 5.5, you have to set the Round-Robin policy manually.
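
A minimal sketch of how the manual change could look from the ESXi shell. The function names are mine. The SATP name is deliberately left as an argument, because it depends on your ESXi version; you can look it up in the "Storage Array Type" field of "esxcli storage nmp device list".

```shell
#!/bin/sh
# Sketch for ESXi hosts older than 5.5: select Round-Robin explicitly.
# Function names are illustrative, not part of any VMware tooling.

# Set Round-Robin on one existing device.
# usage: set_round_robin naa.xxxx
set_round_robin() {
    esxcli storage nmp device set --device "$1" --psp VMW_PSP_RR
}

# Make Round-Robin the default PSP for all devices claimed by one SATP.
# usage: set_default_round_robin SATP_NAME
set_default_round_robin() {
    esxcli storage nmp satp set --satp "$1" --default-psp VMW_PSP_RR
}
```

The per-device call takes effect for that device only, while the SATP default applies to every device claimed by that SATP, so check what else it claims before changing it.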

Round-Robin path switching

By default, ESXi switches to the next path after every 1000 I/Os, which generally works fine in big environments with lots of LUNs and VMs. For some workloads, however, especially when you are dealing with a single volume, you can drastically improve storage latency and throughput by decreasing this value.

http://kb.vmware.com/kb/2069356

You can do it for all volumes presented from the V9000 (you will have to reboot ESXi to apply it to volumes that are already present). Note that the rule matches on vendor IBM and model 2145, so it also changes the setting for other IBM Storwize-based systems:

esxcli storage nmp satp rule add -s "VMW_SATP_ALUA" -V "IBM" -M "2145" -P "VMW_PSP_RR" -O "iops=1"

Or per LUN:

esxcli storage nmp psp roundrobin deviceconfig set --type=iops --iops=1 --device naa.xxxx
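
If you want the per-LUN variant for many volumes without a reboot, the command above can be generated for every matching device. This is a rough sketch, assuming that "esxcli storage core device list" prints each device identifier on an unindented line followed by indented "Vendor:" and "Model:" fields. The function names are hypothetical, and the script only prints commands so you can review them first.

```shell
#!/bin/sh
# Sketch: generate the per-LUN Round-Robin command for every IBM 2145 device.
# Assumes the usual layout of `esxcli storage core device list` output.

# Read the device list on stdin and print identifiers of devices that
# report vendor IBM and model 2145.
ibm_2145_devices() {
    awk '
        /^[^ \t]/        { dev = $1; vendor = "" }  # unindented line starts a new device block
        /^[ \t]+Vendor:/ { vendor = $2 }
        /^[ \t]+Model:/ && vendor == "IBM" && $2 == "2145" { print dev }
    '
}

# Turn the device list on stdin into one command per matching device.
rr_iops_commands() {
    ibm_2145_devices | while read -r dev; do
        echo "esxcli storage nmp psp roundrobin deviceconfig set --type=iops --iops=1 --device $dev"
    done
}
```

Usage on the host would be "esxcli storage core device list | rr_iops_commands" to review the commands, and the same pipeline followed by "| sh" to run them.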

Adapter queue depth

Change this only if you already know that it is your bottleneck: increasing the value can increase throughput, but it can also have a negative impact on latency.

To check your queues and their utilization:

http://kb.vmware.com/kb/1027901

to change them:

http://kb.vmware.com/kb/1267

and to understand them:

https://blogs.vmware.com/vsphere/2012/07/troubleshooting-storage-performance-in-vsphere-part-5-storage-queues.html

vStorage APIs for Array Integration (VAAI)

VAAI is enabled by default, but if it has been turned off on your hosts, make sure you enable it again.

Atomic locking (ATS) in particular is a must, but accelerated init and copy will not hurt you either.

http://kb.vmware.com/kb/1021976

Important: do not forget to disable the ATS Heartbeat feature if you are running vSphere 5.5 U2 or later.

https://www.thevirtualist.org/alert-application-outages-using-vaai-ats-on-vsphere-5-5-update2-vsphere-6-0/
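
As a rough sketch of the VAAI checks described in this section, the following functions read the three host-wide VAAI advanced options and switch off ATS heartbeat for VMFS5. The function names are mine, and the parsing assumes that "esxcli system settings advanced list" prints an "Int Value:" line for each option.

```shell
#!/bin/sh
# Sketch: verify the host-wide VAAI primitives and turn off ATS heartbeat.
# Run on the ESXi host; function names are illustrative.

VAAI_OPTIONS="/VMFS3/HardwareAcceleratedLocking /DataMover/HardwareAcceleratedMove /DataMover/HardwareAcceleratedInit"

# Read `esxcli system settings advanced list -o OPTION` output on stdin
# and print the current integer value (not the default one).
int_value() {
    awk -F': *' '/^[ \t]*Int Value:/ { print $2; exit }'
}

# Print "OPTION = VALUE" for each VAAI primitive; 1 means enabled.
vaai_report() {
    for opt in $VAAI_OPTIONS; do
        echo "$opt = $(esxcli system settings advanced list -o "$opt" | int_value)"
    done
}

# Disable ATS heartbeat on VMFS5 datastores (0 = off).
disable_ats_heartbeat() {
    esxcli system settings advanced set -i 0 -o /VMFS3/UseATSForHBOnVMFS5
}
```

If vaai_report shows 0 for one of the primitives, the same "esxcli system settings advanced set" call with -i 1 and the corresponding option path turns it back on.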

HyperSwap

Make sure that all VMs running on a given HyperSwap volume run on hosts at one site only. To do this, create and maintain DRS “should-run” rules for your VMs based on their datastores. HyperSwap has an active-active architecture that dynamically switches the preferred site based on the I/Os issued, so you will obviously suffer performance issues when issuing I/Os from both sites to a single volume at the same time.

Dead Space Reclamation

Unfortunately, FlashSystem V9000 does not support SCSI UNMAP for dead space reclamation when using thin provisioning. However, there are still ways to do it pretty easily.

As always, the first step is to zero out all the dead space that you want to reclaim.

To do this from within a Windows guest operating system, you can use a tool from Microsoft called “sdelete”
If you want to do it on a VMware datastore, you can simply create and then delete a new thick eager-zeroed virtual disk (vmdk) with the size of the free space you want to reclaim. You can do it from the GUI, by creating a new virtual machine or by adding a new disk to an existing one, or you can use vmkfstools from the console.
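
The vmkfstools route could look roughly like this. It is a sketch with a hypothetical function name, a hypothetical "zerofill" directory and example sizes. Leave some headroom rather than filling the datastore completely, since running VMs may still need to grow.

```shell
#!/bin/sh
# Sketch: zero out free space on a VMFS datastore with a temporary
# eager-zeroed thick disk. Directory and file names are placeholders.

# usage: zero_free_space DATASTORE SIZE   (e.g. zero_free_space datastore1 100G)
zero_free_space() {
    dir="/vmfs/volumes/$1/zerofill"
    mkdir -p "$dir" || return 1
    # Creating an eager-zeroed thick disk writes zeros to all of its blocks.
    vmkfstools -c "$2" -d eagerzeroedthick "$dir/zerofill.vmdk" || return 1
    # Delete the disk again; the zeroed blocks become free datastore space.
    vmkfstools -U "$dir/zerofill.vmdk"
    rmdir "$dir"
}
```

Check the free space with the datastore browser or "df -h" first and pick a size somewhat below it.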

If you are using Real-Time Compression on your volumes, then your work is done, as RTC will reclaim the zeroed space automatically!

If your volumes are only thin provisioned (not compressed), you have to create a thin-provisioned mirror copy of the volume and delete the source copy after synchronization finishes (you have to do this on the FlashSystem V9000).
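
On the V9000 CLI, the mirror-and-delete procedure could be sketched as the three steps below. The helper only prints the commands. The volume name, pool name and 2% real size are placeholders, and the source copy ID is assumed to be 0, so verify it with lsvdiskcopy before removing anything.

```shell
#!/bin/sh
# Sketch: print the V9000 CLI steps for reclaiming space on a thin volume
# through a temporary mirror copy. Nothing is executed.
# Volume name, pool name and copy ID are placeholders to be verified.

# usage: thin_mirror_plan VDISK POOL [SOURCE_COPY_ID]
thin_mirror_plan() {
    vdisk=$1
    pool=$2
    src_copy=${3:-0}
    # 1. Add a thin-provisioned second copy of the volume.
    echo "addvdiskcopy -mdiskgrp $pool -rsize 2% -autoexpand $vdisk"
    # 2. Repeat until the new copy is fully synchronized.
    echo "lsvdisksyncprogress $vdisk"
    # 3. Remove the original copy, leaving only the freshly written thin one.
    echo "rmvdiskcopy -copy $src_copy $vdisk"
}
```

Only run the last step once the synchronization progress has reached 100%.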

That’s all for now, I hope it was helpful and don’t forget to share 😉

Update: added a note on ATS Heartbeat to the VAAI section. Thanks to Pavol for pointing that out.


Dusan Tekeljak

Experienced infrastructure architect and consultant with more than a decade of hands-on expertise in designing, deploying, and optimizing secure, high-performance cloud solutions across Europe and the Middle East. My focus is on VMware technologies, where I’ve led major implementations, architected mission-critical systems for telecom and finance clients, and contributed to industry knowledge as an IBM Redbooks co-author. With a collection of advanced certifications, including VCAP-DCD, VCAP-DCA, VCAP-NV and multiple VMware expert credentials, I combine technical leadership with practical delivery, consistently driving successful infrastructure transformations, operational excellence, and digital innovation for enterprise clients. Opinions are my own!


3 Comments

Pavol Babel
September 29, 2015 at 11:27 pm

Hi Dusan,

looking forward to the V9000 + ESX Redbook. Although I’m not much interested in VMware stuff, it will be interesting for sure. When thinking of the missing deduplication, I keep a smile on my face, as I do not see many reasons for having it; RtC should be enough. The first competitor of RtC (as an online compression engine) seems to be the new XtremIO, however the compression ratios achieved there are not amazing at the moment. RtC is not perfect either and obviously has some issues with rotational disks, but when used with the FlashSystem 840 or the new 900 it is brilliant.
I’m interested in what the real use case for V9000 and VMware could be. No doubt V9000 is one of the fastest arrays nowadays, but is VMware able to utilise it in an efficient way? ESX obviously does not like big VMs with many vCPUs, due to its quite unusual hypervisor scheduling policy. ESX does not have true NPIV; from the guest OS perspective it is still virtual SCSI stuff, with quite a big latency overhead.

Wouldn’t V9000 be overkill for most ESX implementations in the world?

+1 for the dead space reclamation, round robin path switching and that RtC needs at least 8 volumes.

Two little additional notes. Shouldn’t we still disable the AtsHeartBeat even for V9000? There is no news about this in SVC 7.5 (and I still think AtsHeartBeat is badly designed by VMware).
The second goes for RDM: I believe ESX 5.5 RDM does not require consistent LUN numbering across the ESX cluster, however there is no doubt it is still best practice to keep it tidy.


- Dusan Tekeljak
  September 30, 2015 at 12:04 am
  
  Dedup can be useful in some VDI environments where you cannot use linked clones, or in development. There are definitely cases, but I agree it is not that important. Usually you don’t need to dedup Tier 0 data, and compression would be more effective there. And don’t worry, you can utilize it even with VMware: https://blogs.vmware.com/performance/2015/08/project-capstone-shows-monster-vm-performance.html
  
  And thanks a lot for the note about ATSHEARTBEAT, I will update it tomorrow. Of course it is needed; I forgot to mention the most important thing, and I actually wrote a huge shaded box in the paper about it.
  
  
