All about Infiniband in Nexenta.

Added by Alex Destiny about 1 year ago

Hi. For the last three months we have been trying to configure Nexenta to work with an InfiniBand HCA, mostly without success. The HCA is a Mellanox MHQH29B-XTR. The goal is to set up either an SRP target or an iSCSI target (over IPoIB) and connect a VMware ESXi host to it. Somewhere in the forums I found information that InfiniBand has been supported in Nexenta since build 3.0.4. Nexenta sales told me that official support is expected in Q1 2011. I don't need this support in nmc or the web interface; configuring it through the OS CLI is enough. Could anyone explain everything this configuration requires (hardware, software, drivers) and how to set it up? Or share any links that describe how to configure IB in OpenSolaris/Nexenta (with RH/SLES/Debian there is no problem)? Would a QLogic HCA perhaps be a better choice? Let's collect all the information in this thread. Best Regards, Alexey


Replies

RE: All about Infiniband in Nexenta. - Added by Mat Simon about 1 year ago

Hi

InfiniBand isn't my area, but someone has written a great description of how they got it up and running with OpenSolaris (RIP) back then. They use SuperMicro mezzanine cards, which are (ta-da) Mellanox ones. So I think you are on the right track.

Here are all his articles tagged with infiniband: http://www.zfsbuild.com/category/infiniband/ You may also want to check out: http://hub.opensolaris.org/bin/view/Project+comstar/WebHome

Do you already have a server running a Subnet Manager for your IB network? (I'd think so if you already have IB working.)

On NexentaStor you will need to log in as root without the nmc shell (logging in as admin and running su to root via SSH will give you bash). As long as Nexenta hasn't fully integrated IPoIB into their tools, you might be better off with Nexenta NCP. NexentaStor doesn't like it much when you use the UNIX shell tools instead of nmc commands (nmv doesn't update correctly unless it is reloaded afterwards).

On NCP you have a UNIX shell and will have to work with the COMSTAR commands, but that is mainly what you are looking for.

Later on - and this is one great thing about SMF - you can export your COMSTAR config to XML with svccfg export and import it on another machine. (I'd still recommend contacting Nexenta about a migration to NexentaStor at that point.) See: http://wikis.sun.com/display/OpenSolarisInfo/Backing+Up+and+Restoring+a+COMSTAR+Configuration
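A minimal sketch of that export/import round trip, assuming the COMSTAR configuration lives in the stmf SMF service as the linked page describes. The function names and the backup path are made up for illustration:

```shell
#!/bin/sh
# Sketch only: back up and restore the COMSTAR (stmf) configuration via SMF.
# Assumption: the configuration is stored in the "stmf" service.
# Function names and file paths are hypothetical.

# Export the stmf service, including property values (-a), to an XML file.
backup_comstar() {
    svccfg export -a stmf > "$1"
}

# Import a previously exported XML file on the target machine.
restore_comstar() {
    svccfg import "$1"
}

# Example (run as root):
#   backup_comstar /var/tmp/COMSTAR.backup
#   restore_comstar /var/tmp/COMSTAR.backup
```

Copy the backup file to the second machine before running the restore step there.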

RE: All about Infiniband in Nexenta. - Added by Przemyslaw Ceglowski about 1 year ago

Hi Alex,

I've been testing IB with Nexenta 3.0.3 and vSphere 4.1 for some time now and I am very satisfied with the results.

What we are running hardware-wise:

ESXi 4.1 - 448397-B21 HP 4X DDR InfiniBand Dual Port PCIe HCA ConnectX, 25418

IB Switches - 2 x InfiniBand 24 port Switch Cisco SFS-7000D 20Gbps DDR Managed

NexentaStor HA Cluster - 448397-B21 HP 4X DDR InfiniBand Dual Port PCIe HCA ConnectX, 25418

We are using SRP for storage traffic and IPoIB for vMotion and FT. You will only need a Subnet Manager if you want to have IPoIB working. Fortunately our switches have Subnet Manager built in.

The procedure that I followed to enable IB SRP is documented here - http://hub.opensolaris.org/bin/view/Project+srp/srptconfig We had to request beta drivers from Mellanox as the drivers for 4.1 are not yet GA (they should be shortly). There is some work involved in setting up the cards that requires reading the driver manual. Dual-port HBAs on the NexentaStor end are always presented as one target; however, failover-only multipathing is possible. You will need to change the port types on the ESXi end if you would like both ports to serve both IPoIB and SRP over VPI (the same link).

I am just waiting for Mellanox to release the GA drivers for ESX/ESXi 4.1, and then I will be switching our environment from 1GbE iSCSI over to 20Gb IB :)

Good luck with your project! IB hardware is much, much cheaper and faster than 10GbE or FC.

RE: All about Infiniband in Nexenta. - Added by Alex Destiny about 1 year ago

Hi, I'd like to say just one thing: many thanks to all of you for your links and experience. The last two posts look like all the knowledge we need, especially the link http://hub.opensolaris.org/bin/view/Project+srp/srptconfig . Everything else we have already done :). Przemyslaw, could you share the results of your testing with vSphere 4.1? I think they could be interesting for everyone following this path... I promise to publish my own. Best Regards, Alexey

RE: All about Infiniband in Nexenta. - Added by Alex Destiny about 1 year ago

Hello,

Przemyslaw, could you please tell us whether there are any tips for configuring the IB card? We've done everything to configure COMSTAR, SRPT and all the other things, but:

root@myhost:/export/home/admin# stmfadm list-target -v
    Target: eui.003048FFFFF62194
    Operational Status: Offline
    Provider Name     : srpt
    Alias             : -
    Protocol          : SRP
    Sessions          : 0

After running

root@myhost:/export/home/admin# stmfadm online-target eui.003048FFFFF62194

nothing happens, and dmesg reports:

Sep 28 08:18:30 myhost srpt: [ID 780081 kern.notice] NOTICE: stp_ctl, no ports active for HCA 0x003048fffff62194. Target will not be placed online.

OK, next let's look at the output of cfgadm:

root@myhost:/export/home/admin# cfgadm -al
    Ap_Id                          Type         Receptacle   Occupant     Condition
    hca:3048FFFFF62194             IB-HCA       connected    configured   ok
    ib                             IB-Fabric    connected    configured   ok
    ib::iser,0                     IB-PSEUDO    connected    configured   ok
    ib::rdsib,0                    IB-PSEUDO    connected    configured   ok
    ib::sdpib,0                    IB-PSEUDO    connected    configured   ok
    ib::srpt,0                     IB-PSEUDO    connected    configured   ok

The cfgadm_ib man page suggests that we may need to configure the port... OK:

 cfgadm -o comm=port,service=srp -x add_service ib

and then

cfgadm -al
    Ap_Id                          Type         Receptacle   Occupant     Condition
    hca:3048FFFFF62194             IB-HCA       connected    configured   ok
    ib                             IB-Fabric    connected    configured   ok
    ib::3048FFFFF62195,0,srp       IB-PORT      connected    unconfigured unknown
    ib::iser,0                     IB-PSEUDO    connected    configured   ok
    ib::rdsib,0                    IB-PSEUDO    connected    configured   ok
    ib::sdpib,0                    IB-PSEUDO    connected    configured   ok
    ib::srpt,0                     IB-PSEUDO    connected    configured   ok

And we have no idea how or when to configure this IB-PORT (all further attempts led nowhere)... Could someone help?
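For anyone comparing their own listing, here is a small sketch that picks out the attachment points whose Occupant column is not "configured". It assumes the five-column layout shown in the listings above; the function name is made up for illustration:

```shell
#!/bin/sh
# Sketch only: print the Ap_Id of every attachment point in `cfgadm -al`
# output whose Occupant column is not "configured".
# Assumed columns: Ap_Id Type Receptacle Occupant Condition.
# The function name is hypothetical.
unconfigured_ap_ids() {
    # Skip the header row and any line with fewer than five fields.
    awk '$1 != "Ap_Id" && NF >= 5 && $4 != "configured" { print $1 }'
}

# Example (on the storage host):
#   cfgadm -al | unconfigured_ap_ids
```

Run against the second listing above, this would print only the ib::3048FFFFF62195,0,srp entry.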

Best Regards, Alexey

RE: All about Infiniband in Nexenta. - Added by Alex Destiny about 1 year ago

Well, we've tried

root@myhost:/export/home/admin# cfgadm -c configure ib::3048FFFFF62195,0,srp
    cfgadm: Hardware specific failure: configure operation failed ap_id: /devices/ib:fabric::3048FFFFF62195,0,srp

And got nothing...

RE: All about Infiniband in Nexenta. - Added by Alex Destiny about 1 year ago

Hi, maybe we need a dedicated host (running CentOS) with a Subnet Manager, or a managed switch with an SM? What are the functions of the SM? Everything we have done so far was without anything connected to the IB port. Best Regards, Alexey

RE: All about Infiniband in Nexenta. - Added by Przemyslaw Ceglowski about 1 year ago

I have to apologise for the confusion I caused by saying that SRP does not require an SM. Unfortunately, it does. Every InfiniBand network requires a subnet manager. You can choose to run the OFED opensm subnet manager on one of the Linux clients (a single point of failure), or you may choose to use an embedded subnet manager running on a couple of the switches in your fabric. Note that not all switches come with a subnet manager; check your switch documentation.
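As a rough sketch of how to check this from a Linux host with OFED installed: it assumes the sminfo tool from the OFED diagnostics is on the PATH and exits non-zero when no subnet manager answers. The function name is made up for illustration:

```shell
#!/bin/sh
# Sketch only: report whether any subnet manager answers on the fabric.
# Assumptions: `sminfo` (OFED diagnostics) is installed, and it exits
# non-zero when the query fails. The function name is hypothetical.
check_subnet_manager() {
    if sminfo >/dev/null 2>&1; then
        echo "subnet manager found"
    else
        echo "no subnet manager found"
        return 1
    fi
}

# If none is found, opensm can be started in the background on this host:
#   opensm -B
```

Running opensm on a single host is exactly the single-point-of-failure setup described above, so it is best treated as a starting point.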

You should also remember to upgrade your HCA firmware. My ConnectX 25418 required that in order to work.

Best wishes, Przem

RE: All about Infiniband in Nexenta. - Added by Alex Destiny about 1 year ago

Hi

After installing the latest OpenSM on a dedicated host and connecting Nexenta to it, we set up SRP. But now there is a problem with IB on ESXi 4.1. What special configuration is needed after installing the beta 4.1 drivers? All services are loaded, but we still can't see the volume to map on ESXi... Przemyslaw, could you please share your experience configuring ESXi? Maybe there are additional initiator settings? That's the last remaining step...

Best Regards, Alexey

RE: All about Infiniband in Nexenta. - Added by Przemyslaw Ceglowski about 1 year ago

Hi,

Here are the steps to enable the IB HCA under ESXi. Following them you will have VPI enabled and both SRP and IPoIB on each port of the dual port card.

Hardware specification: HP InfiniBand 4X DDR PCI-E Dual Port HCA (448397-B21) / Mellanox Technologies MT25418 [ConnectX VPI PCIe 2.0 2.5GT/s - IB DDR / 10GigE] (rev a0)

  1. Place the ESXi server in maintenance mode and power it down.

  2. Install the card and power on the server.

  3. Log in to vCenter and start vCLI.

    C:\Program Files (x86)\VMware\VMware vSphere CLI>cd bin
    C:\Program Files (x86)\VMware\VMware vSphere CLI\bin>vihostupdate.pl --server [ESXiIPADDRESS] -b d:\MEL-OFED-1.4.1-375-offline_bundle.zip -c -i

  4. Reboot the server.

  5. Configure default port on HCA to IB.

    C:\Program Files (x86)\VMware\VMware vSphere CLI\bin>vicfg-module.pl --server [ESXiIPADDRESS] -s "porttypedefault=1" mlx4_en

  6. Configure VPI on ports.

    The PCI bus ID of the ConnectX devices can be retrieved from the interface names of the uplinks. Go to Configuration -> Network Adapters in the vSphere Client. The interface name has the format vmnicX.pY, where X is the PCI bus device ID and Y is the port number. For example, vmnic4.p1 is installed on PCI bus ID 4. The value 4 should be used as the first value in the VPI configuration triplet.

    C:\Program Files (x86)\VMware\VMware vSphere CLI\bin>vicfg-module.pl --server [ESXiIPADDRESS] -s "porttypes=4,1,1,4,2,1" mlx4_en

  7. Reboot the server.

  8. Exit maintenance mode.
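The bus ID lookup in step 6 can be sketched as a small helper that turns uplink names into the porttypes string. This assumes each triplet is (PCI bus ID, port number, port type code), which is inferred from the example value in step 6 and not confirmed by the driver manual. The function name is made up for illustration:

```shell
#!/bin/sh
# Sketch only: build the porttypes string from uplink names like vmnicX.pY.
# Assumption: each triplet is <PCI bus id>,<port number>,<port type code>,
# inferred from the example "porttypes=4,1,1,4,2,1".
# The function name is hypothetical.
# Usage: build_porttypes <uplink> <type code> [<uplink> <type code> ...]
build_porttypes() {
    result=""
    pattern='s/^[Vv]mnic\([0-9][0-9]*\)\.p\([0-9][0-9]*\)$'
    while [ "$#" -ge 2 ]; do
        name=$1; ptype=$2; shift 2
        # vmnic4.p1 -> bus=4, port=1
        bus=$(printf '%s\n' "$name" | sed -n "$pattern/\\1/p")
        port=$(printf '%s\n' "$name" | sed -n "$pattern/\\2/p")
        if [ -z "$bus" ] || [ -z "$port" ]; then
            echo "unrecognized uplink name: $name" >&2
            return 1
        fi
        # Append the triplet, comma-separated from any earlier ones.
        result="${result:+$result,}$bus,$port,$ptype"
    done
    echo "porttypes=$result"
}

# Example, reproducing the value used in step 6:
#   build_porttypes vmnic4.p1 1 vmnic4.p2 1
```

The type codes are passed through unchanged; check the Mellanox driver manual for their meaning before applying the result with vicfg-module.pl.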

RE: All about Infiniband in Nexenta. - Added by Jon Schillinger about 1 year ago

Hello All,

I am struggling to get IB working with Nexenta CE. Actually, the IB seems to be working but VMware doesn't see the LUNs properly. So, thanks to the information in this thread and much Googling, I have vSphere 4.1u1 (ESXi 4.1) configured, the Subnet Manager in the switch running, and Nexenta CE configured. The IB network seems to be up and working. vSphere hosts can see the LUNs on Nexenta, but they are 0 bytes. The same LUNs are currently shared over iSCSI and working fine, but on IB they are 0 bytes. I thought iSCSI was interfering somehow, so I disabled the iSCSI software initiator on one host and rebooted. It could still see the LUNs, but 0 bytes and it couldn't see the existing VMFS, nor allow me to create a new one. Any ideas??

Thanks!

RE: All about Infiniband in Nexenta. - Added by Gary K about 1 year ago

Actually, that's an easy one (because I ran into it a few days ago). ESX is limited to volumes of 2TB - 512 bytes. So when using SRP, make sure that when you create the file that will back your datastore, you use a size that is less than 2TB; 2047g would work, for example.
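A quick way to check a planned size against that limit, as a sketch. It assumes binary units (1TB = 1024^4 bytes); the function names are made up for illustration:

```shell
#!/bin/sh
# Sketch only: compare a planned LUN size against the 2TB - 512 byte limit.
# Assumption: binary units, so 2TB = 2 * 1024^4 bytes.
# Function names are hypothetical.

# Largest usable volume size in bytes.
max_lun_bytes() {
    echo $(( 2 * 1024 * 1024 * 1024 * 1024 - 512 ))
}

# Succeeds (exit status 0) if the given size in bytes fits within the limit.
lun_size_ok() {
    [ "$1" -le "$(max_lun_bytes)" ]
}

# 2047g expressed in bytes.
size_2047g=$(( 2047 * 1024 * 1024 * 1024 ))
if lun_size_ok "$size_2047g"; then
    echo "2047g fits"
fi
```

A volume of exactly 2TB fails this check, since it is 512 bytes over the limit.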

But now I have a question. I'm setting up a whole new infrastructure and for the life of me cannot get this BX5020 to work for IPoIB. I can ping from VM to VM, but getting out to the Ethernet network is proving to be a challenge. I actually downgraded ESX to 4.0u1 because the beta drivers still have a lot to be worked out. When I do an ARP lookup on the VM, it sees the other VMs but not the VLAN interface on the Cisco switch that the bridge uplinks to. My subnet manager is running on the IS5030, and IPoIB seems to be working fine within that fabric.

Any ideas? Mellanox support seems to be pretty useless; every time I ask a question I have to wait 1-2 days because they just forward it to Israel. When their solution doesn't work, it's the same process again.

RE: All about Infiniband in Nexenta. - Added by Jon Schillinger about 1 year ago

D'oh! Thanks Gary, that was it. I created the data stores at 2.0TB exactly and they worked fine over iSCSI, so I don't know why they wouldn't work over SRP. Anyway, I resized the data stores to 2.0TB - 128K and they work now. It didn't even affect the VMFS (I tried on an empty VMFS first) so I didn't have to move VMs back and forth.

I'm not using IPoIB, so sorry I can't help with your problem.