vmware clone fast on iscsi, slow on nfs

Added by Matt Van Mater 12 months ago

Hi all,

I wanted to share an experience I've had with Nexenta 3.1.2 CE. My Nexenta storage is actually a virtual machine with 8 physical disks (4 mirror vdevs) and 1 physical disk split into 2 virtual disks (1 log, 1 cache). This storage is then re-shared back to the parent ESXi 5.0 hypervisor for other VMs.

I have tested both NFS and iSCSI on this same storage pool, using a separate VM running iometer 2006. With some tuning on Nexenta, the iometer benchmarks are roughly equal in a single-vNIC setup, and iSCSI is slightly better in a multi-vNIC setup. However, there are some notable real-world differences that are not apparent in the iometer results...

I have made the ZFS storage pool accessible as ESXi datastores via both NFS and iSCSI at the same time in VMware. Then I created a base install of an Ubuntu 10.04.4 VM and made it a template on a local hardware RAID mirror. I can deploy this template from the mirror to either datastore in about 30 seconds (NFS is a little faster). However, when I clone that template again to test reading and writing from the same datastore, there are dramatic differences in performance. When I copy the VM from NFSdatastore --> NFSdatastore the process takes about 100 seconds; when I copy the VM from iscsidatastore --> iscsidatastore the process takes about 20 seconds. NFS is 5x slower for cloning!

I was thinking, how can this be? How can the NFS and iSCSI protocols have the same iometer results but have such different real-world perception?

I then cloned the VM from NFSdatastore --> iscsidatastore and the operation took about 100 seconds... but when I cloned the VM from iscsidatastore --> NFSdatastore the process was MUCH faster, about 30 seconds or so. Keep in mind that both NFS and iSCSI are being used concurrently to read/write to the same backend ZFS volume!

To me, this indicates that NFS read operations are somewhat slow on VMware 5.0, Nexenta 3.1.2, or both, but write operations are acceptable.
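As a rough sketch of that reasoning (Python, with the approximate timings from this post hard-coded; the names `clone_seconds` and `mean_by` are made up for illustration), grouping the four clone times by source and then by destination shows which side dominates:

```python
from statistics import mean

# Approximate clone times in seconds, as reported above: (source, destination) -> seconds.
clone_seconds = {
    ("nfs", "nfs"): 100,
    ("nfs", "iscsi"): 100,
    ("iscsi", "nfs"): 30,
    ("iscsi", "iscsi"): 20,
}

def mean_by(position):
    """Average clone time grouped by source (position 0) or destination (position 1)."""
    groups = {}
    for key, seconds in clone_seconds.items():
        groups.setdefault(key[position], []).append(seconds)
    return {protocol: mean(times) for protocol, times in groups.items()}

by_source = mean_by(0)
by_destination = mean_by(1)

print("mean seconds by source:     ", by_source)
print("mean seconds by destination:", by_destination)
```

With these numbers the average differs by about 4x depending on the source datastore, but only slightly depending on the destination, which is what points at the NFS read path rather than the write path.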

Has anyone else seen the same kind of disparity?

Here is some more detail on my Nexenta/VMware tweaks.

Nexenta:

  • Settings --> Network: all interfaces MTU = 9000
  • Settings --> Preferences --> System: Sys_zfs_vdev_max_pending = 1, Sys_zfs_nocacheflush = yes
  • Settings --> Preferences --> Network: Net_tcp_naglim_def = 1, Net_tcp_recv_hiwat = 1073741824, Net_tcp_xmit_hiwat = 1073741824
  • Data Management --> SCSI Target --> View zvols --> myzvol: Writeback Cache = off

ESXi:

  • 2 vmknics on separate subnets (corresponding to Nexenta's vNICs), configured to use MTU 9000
  • iSCSI storage path policy = round robin, iops = 1


Replies

RE: vmware clone fast on iscsi, slow on nfs - Added by Jeff Gibson 12 months ago

Matt this sounds like iscsi is working as designed with VAAI extensions. Does your NFS datastore indicate it has hardware acceleration (I'm pretty sure it's just iSCSI that has that option)?

RE: vmware clone fast on iscsi, slow on nfs - Added by Matt Van Mater 12 months ago

Jeff Gibson wrote:

Matt this sounds like iscsi is working as designed with VAAI extensions. Does your NFS datastore indicate it has hardware acceleration (I'm pretty sure it's just iSCSI that has that option)?

Hi Jeff,

Nope, sorry, there are no VAAI extensions running here. This is a Nexenta Community Edition install only, and VAAI comes to paying customers only. I don't see any indication of hardware acceleration for the NFS datastore.

To reiterate the above: when going from a 2-disk mirror (HW RAID) to a 4 x 2-disk mirror (ZFS with cache + ZIL), the template is cloned successfully in about 30 seconds. So I expected that when cloning the same template from the 4x2 mirror to the same 4x2 mirror I would get at LEAST the same performance as the first test... and with iSCSI it is better (only 20 seconds for the whole clone operation to complete). With NFS, however, it takes almost 2 minutes. I have narrowed this down somewhat to NFS's read performance.

In case anyone is wondering, I am "stuck" on this issue due to the following pros/cons:

iSCSI

Pros

  • VMware's multipath IO gives ~40% improvement over NFS on sequential read performance, as shown by iometer
  • cloning from template is significantly faster (as described above)

Cons

  • seems to hang a big Win 2008 VM of mine after transmitting ~300 GB of ~800 GB of data. This is repeatable; I can demonstrate it to someone from Nexenta if requested. I can't determine whether it is Nexenta's or VMware's fault, but it definitely only occurs with iSCSI.
  • VMware does not handle iSCSI failures gracefully... poor LUN connectivity will hang the entire hypervisor on ESXi 4.1, 4.1u2, and 5.0u1 installs
  • VMware does not "try again" if an iSCSI-mounted datastore is not immediately available when the hypervisor is booting (e.g. in my case the Nexenta VM is not available for several minutes after the ESXi host is booted, and therefore the iSCSI target isn't available yet)

NFS

Pros

  • VMware's NFS seems to be more tolerant of network failures; exported NFS shares are mounted ~3 minutes after ESXi is booted, so I can autostart VMs located on the NFS-based datastore
  • has never crashed/hung VMs as predictably as iSCSI has
  • more flexible for VMware 4.x installs, which have 2 TB LUN limits

Cons

  • slower sequential throughput
  • VMware does not show performance statistics for NFS-based datastores in the per-VM performance monitor
  • inexplicable latency according to VMware's system-wide performance monitor, but iometer does not reflect the same numbers
  • no multipath IO means a physical install is limited to a single NIC (no multipath IO = can't aggregate performance from multiple 1 Gb NICs without being a PITA)

RE: vmware clone fast on iscsi, slow on nfs - Added by Dan Swartzendruber 12 months ago

Yeah, Matt, I feel your pain. The unstable iSCSI target under heavy load hosed me once, and that was enough. The entire hypervisor was dead in the water and I had to hit the reset button. There was NOT happiness and joy here at that point... NFS has been 1000% stable and easy.

RE: vmware clone fast on iscsi, slow on nfs - Added by Bee Gee 10 months ago

Matt,

VAAI does work on Nexenta's iSCSI LUNs on VMFS datastores.

see link: http://blog.solori.net/2011/08/03/short-take-nexenta-3-1-adds-vaai-support-auto-sync-resume/

It would appear that it is still unsupported on NFS datastores. That's likely the reason for your fast performance on iSCSI and slow performance on NFS.

I ended up going 10GbE with NFS on our Nexenta build to get the speed we needed on the network. A clone that was taking 6 hours+ on a 1Gbps link now takes ~55-65 minutes.
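For a rough sense of scale, here is a back-of-the-envelope sketch in Python. It assumes, purely for illustration, that a clone is limited only by the link's theoretical line rate, which ignores protocol overhead and any disk limits; the function and variable names are made up.

```python
# Theoretical line rate in megabytes per second (decimal units, no protocol overhead).
def line_rate_mb_per_s(gbps):
    return gbps * 1000 / 8

# Clone durations reported above, in minutes.
before_min = 6 * 60          # "6 hours+" on 1 Gbps
after_min_range = (55, 65)   # "~55-65 minutes" on 10GbE

speedup_low = before_min / after_min_range[1]
speedup_high = before_min / after_min_range[0]

# Upper bound on data moved in 6 hours at 1 Gbps line rate, in terabytes.
max_tb_in_6h = line_rate_mb_per_s(1) * before_min * 60 / 1_000_000

print(f"1 GbE line rate:  {line_rate_mb_per_s(1):.0f} MB/s")
print(f"10 GbE line rate: {line_rate_mb_per_s(10):.0f} MB/s")
print(f"observed speedup: {speedup_low:.1f}x to {speedup_high:.1f}x")
print(f"max data in 6 h at 1 Gbps: {max_tb_in_6h:.1f} TB")
```

The observed speedup works out to roughly 5.5x to 6.5x rather than the full 10x the link upgrade allows, which would be consistent with something other than the network becoming the limit once 10GbE is in place.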

All that being said, we went NFS 100% on our Nexenta solution because I need stability first and foremost and I do not trust the issues with iSCSI on Nexenta at this time.

RE: vmware clone fast on iscsi, slow on nfs - Added by Matt Van Mater 7 months ago

It has been a while so I thought I would check in...

  • Nexenta's own best practices documents recommend the use of NFS (http://info.nexenta.com/s/nexenta/images/5000-nxs-v0.0-000002-Anxstovmwaebestpactices.pdf, page 12)
  • VMware released support for Hardware Acceleration for NAS devices (i.e. NFS) a year and a half ago. (http://www.vmware.com/files/pdf/techpaper/Whats-New-VMware-vSphere-50-Storage-Technical-Whitepaper.pdf, page 13)

So when will Nexenta support the VMWare NAS VAAI storage primitives (especially the Full File Clone)?

Nexenta has had a long time to come out with support for this option. Are there simply no customers driving this feature?

RE: vmware clone fast on iscsi, slow on nfs - Added by Matt Van Mater 7 months ago

No comments from Nexenta (Derek Glover)?

I saw references from over a year ago that Nexenta was "working on" this feature... so what's the deal?

RE: vmware clone fast on iscsi, slow on nfs - Added by Derek Glover 7 months ago

We are looking into when this can be integrated, tested and released. It will likely come after the upcoming 4.0 with Illumos shift.

RE: vmware clone fast on iscsi, slow on nfs - Added by Matt Van Mater 7 months ago

Thanks Derek... I realize this was scheduled for this past summer and was delayed, but can you comment on the new expected date for the 4.0 release?

By the way, if you need testers, sign me up :)

RE: vmware clone fast on iscsi, slow on nfs - Added by Derek Glover 7 months ago

I don't have a date for release yet. But on the beta subject, check back in 1-2 weeks, hoping to have a public beta available soon.

RE: vmware clone fast on iscsi, slow on nfs - Added by Matt Van Mater 6 months ago

It has been 3 weeks :)

Nexenta CE 3.1.3 connected to VMware 4.1 (no VAAI) or 5.0 via NFS is still inexplicably slow compared to iSCSI. Everything is running over 1 GbE Intel NICs.