
NVMeoF using Intel® SSD and Intel® Omni-Path Architecture

By Kien Dinh, Naoyuki Mori, and Congmin Yin on Jul 08, 2020

Introduction

Non-Volatile Memory Express over Fabrics (NVMeoF) is becoming one of the most disruptive technologies for data center storage. It is designed to deliver high-throughput, low-latency NVMe* SSD technology over a network fabric. This document demonstrates how to use Intel® Omni-Path Architecture (Intel® OPA) and the Intel® SSD Data Center Family for NVMe with open-source projects to build NVMeoF in native and cloud environments. Intel OPA supports remote direct memory access (RDMA), which transfers memory data over the network without consuming CPU cycles, enabling high throughput and low latency.

Hardware preparation

The demo system is composed of two Intel® Xeon® Scalable processor systems, each equipped with an Intel OPA card 100HFA018FS (100 Gbps). The cards are connected via an Intel OPA edge switch 100 series (100 Gbps)*. One of the systems, the storage target system, is equipped with a number of Intel® Optane™ SSD DC P4800X drives (375 GB). Figure 1 describes the demo setup.

 

Figure 1. Diagram of the demo setup

 

Follow the instructions in the EdgeSwitch* user manual to set up the switch. It is important to make sure that the fabric manager and its subnet manager are started correctly (See section 5.9.2).
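After the systems are installed and the Intel OPA host tools are available (on Clear Linux they come with the hpc-utils bundle added in the next section; packaging differs by distribution), the local port state on each server can be checked with opainfo to confirm that the fabric manager is up:

$ sudo opainfo    # PortState should report Active once a subnet manager is running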

System software setup

Clear Linux* installation

Follow the instructions to install Clear Linux from the live server image on both systems. Install Clear Linux v31640 or later, where the opa-fm tools for the Intel OPA fabric manager are included in the hpc-utils bundle. These tools are required to run the subnet manager in case of a direct back-to-back connection between the two Intel OPA adapters.
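If the two Intel OPA adapters are connected back to back rather than through the managed switch, the host-side fabric manager must be running on one of the systems. Assuming the opa-fm package installs its usual systemd unit, it can be enabled as follows once the hpc-utils bundle (added in the steps below) is present:

$ sudo systemctl enable --now opafm
$ systemctl status opafm    # verify the subnet manager is running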

After the OS installation, perform the following steps to install necessary bundles and prepare the system.

1. Install the necessary bundles.

$ sudo swupd bundle-add os-clr-on-clr cloud-native-basic kvm-host-dev storage-utils openstack-common

    2. Load kernel modules for NVMe and RDMA on the target system with the NVMe SSD.

    $ sudo mkdir /etc/modules-load.d
    $ sudo vi /etc/modules-load.d/nvme-target.conf
           # Load modules for NVME target
           nvmet
           nvmet-rdma
    $ sudo reboot
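
The modules-load.d entry takes effect at the next boot. To load the modules immediately without rebooting, modprobe can be used as well:

$ sudo modprobe nvmet
$ sudo modprobe nvmet-rdma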
    

      3. Verify the nvme module after rebooting.

      $ lsmod | grep nvmet
      nvmet_tcp               24576  0
      nvmet_rdma              28672  1
      rdma_cm                 61440  5 rpcrdma,nvme_rdma,nvmet_rdma,ib_iser,rdma_ucm
      ib_core                286720  16 rdma_cm,ib_ipoib,opa_vnic,rpcrdma,nvme_rdma,nvmet_rdma,iw_cm,ib_iser,ib_umad,hfi1,i40iw,rdma_ucm,ib_uverbs,ib_cm,rdmavt
      nvmet                   81920  14 nvmet_tcp,nvmet_rdma
      

        4. Verify the rdma module after rebooting.

        $ lsmod | grep rdma
        nvme_rdma                36864  0
        nvme_fabrics             24576  1 nvme_rdma
        rpcrdma                 217088  0
        sunrpc                  344064  1 rpcrdma
        rdma_ucm                 28672  0
        rdmavt                   98304  1 hfi1
        ib_uverbs               122880  3 i40iw,rdma_ucm,rdmavt
        nvmet_rdma               28672  1
        rdma_cm                  61440  5 rpcrdma,nvme_rdma,nvmet_rdma,ib_iser,rdma_ucm
        iw_cm                    49152  1 rdma_cm
        ib_cm                    53248  1 rdma_cm
        ib_core                 286720  16 rdma_cm,ib_ipoib,opa_vnic,rpcrdma,nvme_rdma,nvmet_rdma,iw_cm,ib_iser,ib_umad,hfi1,i40iw,rdma_ucm,ib_uverbs,ib_cm,rdmavt
        nvmet                    81920  14 nvmet_tcp,nvmet_rdma
        

          5. On the initiator system, execute the following commands.

          $ sudo swupd bundle-add devpkg-libiscsi devpkg-open-iscsi
          $ sudo mkdir /etc/modules-load.d
          $ sudo vi /etc/modules-load.d/nvme-initiator.conf
# Load modules for NVMe initiator
                 nvme-rdma
          $ sudo reboot
          

            6. Verify the nvme kernel module after rebooting.

            $ lsmod | grep nvme
            nvme_rdma                 36864  0
            rdma_cm                   61440  4  rpcrdma,nvme_rdma,ib_iser,rdma_ucm
            ib_core                  286720  15 rdma_cm,ib_ipoib,opa_vnic,rpcrdma,nvme_rdma,iw_cm,ib_iser,ib_umad,hfi1,i40iw,rdma_ucm,ib_uverbs,ib_cm,rdmavt
            nvme_fabrics              24576  1 nvme_rdma
            

              7. Verify the rdma kernel module after rebooting.

              $ lsmod | grep rdma
              rpcrdma               217088  0
              sunrpc                344064  1 rpcrdma
              rdma_ucm               28672  0
              rdmavt                 98304  1 hfi1
              ib_uverbs             122880  3 i40iw,rdma_ucm,rdmavt
              nvme_rdma              36864  0
              rdma_cm                61440  4 rpcrdma,nvme_rdma,ib_iser,rdma_ucm
              iw_cm                  49152  1 rdma_cm
              ib_cm                  53248  1 rdma_cm
              ib_core               286720  15 rdma_cm,ib_ipoib,opa_vnic,rpcrdma,nvme_rdma,iw_cm,ib_iser,ib_umad,hfi1,i40iw,rdma_ucm,ib_uverbs,ib_cm,rdmavt
              nvme_fabrics           24576  1 nvme_rdma
              

Among the modules listed above, hfi1 is the driver for the Intel OPA NIC and ib_core is the kernel module for the InfiniBand* Verbs API. This confirms that the RDMA implementation for Intel OPA is supported by default in the Clear Linux kernel.
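The RDMA device exposed by the Intel OPA adapter can also be inspected with the standard verbs utilities, assuming the rdma-core tools are installed (the device name, hfi1_0 here, may differ):

$ ibv_devices              # should list the hfi1 device
$ ibv_devinfo -d hfi1_0    # port state should be PORT_ACTIVE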

                Test NVMeoF on bare metal

                NVMeoF using Intel OPA can be tested directly on bare metal, as described in the following steps.

                On the target system

In this example, the IP address of the Intel OPA interface on the target system is set to 192.168.2.1. Note that the network interface name of the Intel OPA adapter here is ibp175s0; it can differ on other systems.

1. Assign a static IP address to the Intel OPA interface by adding the following file to the /etc/systemd/network folder.

                $ sudo vi /etc/systemd/network/70-omni-static.network
                [Match]
                Name=ibp175s0
                
                [Network]
                Address=192.168.2.1/24
                

                  2. Restart the systemd-networkd service.

                  $ sudo systemctl restart systemd-networkd
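
The assigned address can be verified with the ip tool (using the interface name from the step above):

$ ip addr show ibp175s0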

                   

                    3. Execute the following commands with root privilege. They can be summarized into a script to run on boot-up as well. Make sure that the necessary modules have been loaded beforehand.

                    # mkdir /sys/kernel/config/nvmet/subsystems/nvme-subsystem-rdma
                    # cd /sys/kernel/config/nvmet/subsystems/nvme-subsystem-rdma/
                    # echo 1 > attr_allow_any_host
                    # mkdir namespaces/10
                    # cd namespaces/10/
# echo -n /dev/nvme0n1 > device_path
                    # echo 1 > enable
                    # mkdir /sys/kernel/config/nvmet/ports/1
                    # cd /sys/kernel/config/nvmet/ports/1/
                    # echo 192.168.2.1 > addr_traddr
                    # echo rdma > addr_trtype
                    # echo 4420 > addr_trsvcid
                    # echo ipv4 > addr_adrfam
                    # ln -s /sys/kernel/config/nvmet/subsystems/nvme-subsystem-rdma /sys/kernel/config/nvmet/ports/1/subsystems/nvme-subsystem-rdma
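
As an optional alternative to re-running these commands after every boot, the nvmetcli utility can save and restore the whole nvmet configfs layout, if it is available on your system (it is not guaranteed to be packaged in Clear Linux):

$ sudo nvmetcli save /etc/nvmet/config.json
$ sudo nvmetcli restore /etc/nvmet/config.json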
                    

4. Verify that port 1 is enabled by issuing the following command.

                      $ dmesg | grep "enabling port 1"
                      [ 3067.082996] nvmet_rdma: enabling port 1 (192.168.2.1:4420)
                      

                        On the initiator/client system

In this example, the IP address of the Intel OPA interface on the initiator/client system is set to 192.168.2.2. Note that the interface name of the Intel OPA adapter here is ibp176s0; it can differ on other systems.

1. Assign a static IP address to the Intel OPA interface by adding the following file to the /etc/systemd/network folder.

                        $ sudo vi /etc/systemd/network/70-omni-static.network
                        [Match]
                        Name=ibp176s0
                        
                        [Network]
                        Address=192.168.2.2/24
                        

                          2. Restart the systemd-networkd service.

                          $ sudo systemctl restart systemd-networkd

                            3. The NVMe device attached to the target can be discovered by executing the following command.

                            $ sudo nvme discover -t rdma -a 192.168.2.1 -s 4420
                            Discovery Log Number of Records 1, Generation counter 24
                            =====Discovery Log Entry 0======
                            trtype:  rdma
                            adrfam:  ipv4
                            subtype: nvme subsystem
                            treq:    not specified, sq flow control disable supported
                            portid:  1
                            trsvcid: 4420
                            subnqn:  nvme-subsystem-rdma
                            traddr:  192.168.2.1
                            rdma_prtype: not specified
                            rdma_qptype: connected
                            rdma_cms:    rdma-cm
                            rdma_pkey: 0x0000
                            

4. The remote NVMe SSD can then be connected and used like a normal local SSD with the following commands.

                              $ sudo nvme connect -t rdma -n nvme-subsystem-rdma -s 4420 -a 192.168.2.1
                              $ lsblk
                              NAME    MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
                              sda       8:0    0 223.6G  0 disk
                              ├─sda1    8:1    0   142M  0 part
                              ├─sda2    8:2    0   244M  0 part
                              └─sda3    8:3    0 223.2G  0 part /
                              nvme0n1 259:4    0 349.3G  0 disk
                              

                                5. Note that on this initiator, the newly connected NVMe SSD device is nvme0n1. The name can be different on other systems. The performance of the NVMeoF SSD can be tested by using FIO.

                                $ sudo fio --blocksize=4k --iodepth=32 --rw=randwrite --ioengine=libaio --ramp_time=10 --runtime=60 --group_reporting --thread --direct=1 --name=/dev/nvme0n1 --numjobs=8
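
For example, the same test can be switched to random reads by changing only the --rw parameter:

$ sudo fio --blocksize=4k --iodepth=32 --rw=randread --ioengine=libaio --ramp_time=10 --runtime=60 --group_reporting --thread --direct=1 --name=/dev/nvme0n1 --numjobs=8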
                                

                                  6. Parameters should be changed to suit the desired test scenarios. After that, the remote NVMe SSD can be disconnected with the following command.

                                  $ sudo nvme disconnect -d /dev/nvme0n1
                                  

                                    NVMeoF on Kubernetes* with SODA*

SODA/OpenSDS is an open source project under the Linux Foundation* that delivers an integration solution for cloud storage. It provides container storage interface (CSI) plugins for Kubernetes to support many storage types, including NVMeoF. This section describes how to configure and enable NVMeoF on Kubernetes using OpenSDS.

                                    Build and install OpenSDS on the target host

1. The OpenSDS controller (Hotpot) needs to be configured, built, and installed on the target host so that it becomes the OpenSDS endpoint that provides storage to others.

                                    $ go get github.com/opensds/opensds
                                    $ cd ~/go/src/github.com/opensds/opensds
                                    

                                      2. Change the transport type of NVMe from TCP to RDMA.

                                      $ sed -i 's/transtype = "tcp"/transtype = "rdma"/g' contrib/drivers/lvm/targets/targets.go
                                      $ export GRPC_GO_REQUIRE_HANDSHAKE=off
                                      $ make
                                      $ sudo mkdir -p /usr/local/bin
                                      $ sudo cp build/out/bin/* /usr/local/bin/
                                      

3. Modify the install script and comment out the unnecessary parts, since Clear Linux already provides them through the bundles installed earlier. Change the LVM_DEVICE parameter to point to the NVMe device to be used over RDMA. In this example, LVM_DEVICE is set to /dev/nvme0n1 by default.

                                        $ sudo mkfs.ext4 -F /dev/nvme0n1
                                        $ vi ./install/devsds/lib/lvm.sh
                                              # in function osds::lvm::install() comment out the following:
                                              # osds::lvm::pkg_install
                                              # osds::nfs::pkg_install
                                              # osds::lvm::nvmeofpkginstall
                                        

                                          4. Edit the local.conf file. This example uses vi; you may use any text editor.

$ vi ./install/devsds/local.conf
# modify OPENSDS_AUTH_STRATEGY to use keystone
OPENSDS_AUTH_STRATEGY=keystone

                                          5. Run install.sh and export the following:

                                          $ sudo ./install/devsds/install.sh
                                          $ export OPENSDS_AUTH_STRATEGY=keystone
                                          $ export OPENSDS_ENDPOINT=http://localhost:50040
                                          $ export OS_AUTH_URL=http://localhost/identity
                                          $ export OS_USERNAME=admin
                                          $ export OS_PASSWORD=opensds@123
                                          $ export OS_TENANT_NAME=admin
                                          $ export OS_PROJECT_NAME=admin
                                          $ export OS_USER_DOMAIN_ID=default
                                          $ export HOST_IP=localhost
                                          

                                            6. When the installation successfully completes, confirm that the NVMe is mounted correctly to an OpenSDS folder.

                                            $ lsblk
                                            NAME           MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
                                            sda              8:0    0 223.6G  0 disk
                                            ├─sda1           8:1    0   142M  0 part
                                            ├─sda2           8:2    0   244M  0 part
                                            └─sda3           8:3    0 223.2G  0 part /
                                            sr0             11:0    1  1024M  0 rom 
                                            nvme0n1        259:0    0 349.3G  0 disk /opt/opensdsNvme/opensds-volumes-nvme
                                            

7. You can interact with OpenSDS using the command below. (Use ‘osdsctl --help’ for more information.)

                                              $ osdsctl pool list
                                              

8. Create a new profile that supports the nvmeof protocol.

$ osdsctl profile create '{"name": "nvmeRDMA", "description": "default policy", "storageType": "block", "provisioningProperties":{"ioConnectivity": {"accessProtocol": "nvmeof"}}}'
                                                

9. Use the following command to see the profileID of the nvmeRDMA profile. It will be needed later for the K8s pod that requires a remote volume.

                                                  $ osdsctl profile list
                                                  

                                                    Install Kubernetes to work with OpenSDS on the initiator/client host

1. Perform the steps in this tutorial to install the K8s cluster on the Clear Linux initiator/client host. Remember to set the proxies, if necessary.

                                                    $ go get github.com/opensds/opensds
                                                    $ cd ~/go/src/github.com/opensds/opensds
                                                    $ make
                                                    $ sudo mkdir -p /usr/local/bin
                                                    $ sudo cp build/out/bin/* /usr/local/bin/
                                                    $ export OPENSDS_AUTH_STRATEGY=keystone
                                                    $ export OPENSDS_ENDPOINT=http://192.168.2.1:50040  # IP of the target host
                                                    $ export OS_AUTH_URL=http://192.168.2.1/identity     # IP of the target host
                                                    $ export OS_USERNAME=admin
                                                    $ export OS_PASSWORD=opensds@123
                                                    $ export OS_TENANT_NAME=admin
                                                    $ export OS_PROJECT_NAME=admin
                                                    

                                                      2. Now, osdsctl can access the target in the same way.

                                                      $ osdsctl pool list
                                                      

3. Set up a local Docker registry to hold the container images, including the OpenSDS plugins.

                                                        $ docker run -d -p 5000:5000 --restart=always --name registry registry:2
                                                        

                                                          4. Download and build the OpenSDS Kubernetes CSI plugins.

                                                          $ go get github.com/opensds/nbp
                                                          $ cd ~/go/src/github.com/opensds/nbp
                                                          $ sed -i 's/tcp/rdma/g' vendor/github.com/opensds/opensds/contrib/connector/nvmeof/nvmeof_helper.go
                                                          $ make docker
                                                          $ docker tag opensdsio/csiplugin localhost:5000/opensdsio/csiplugin
                                                          $ docker push localhost:5000/opensdsio/csiplugin
                                                          

5. Update the K8s deployment files so that the OpenSDS plugin pods pull the image from the local registry.

                                                            $ sed -i 's/image: opensdsio\/csiplugin/image: localhost:5000\/opensdsio\/csiplugin/g' csi/server/deploy/kubernetes/*
                                                            

6. Change the OpenSDS endpoint settings in the config file.

                                                              $ vi csi/server/deploy/kubernetes/csi-configmap-opensdsplugin.yaml
                                                              opensdsendpoint: http://192.168.2.1:50040
                                                              opensdsauthstrategy: keystone
                                                              osauthurl: http://192.168.2.1/identity
                                                              

                                                                7. Deploy the OpenSDS CSI plugins for Kubernetes.

                                                                $ kubectl create -f csi/server/deploy/kubernetes/
                                                                $ kubectl get pod
                                                                NAME                                 READY   STATUS    RESTARTS   AGE
                                                                csi-attacher-opensdsplugin-0         3/3     Running   0          4d9h
                                                                csi-nodeplugin-opensdsplugin-w9qxp   2/2     Running   0          4d9h
                                                                csi-provisioner-opensdsplugin-0      2/2     Running   0          4d9h
                                                                csi-snapshotter-opensdsplugin-0      2/2     Running   0          4d9h
                                                                

8. Prepare and run an example workload that uses a persistent volume backed by NVMe over RDMA. Set the profile ID to that of the nvmeRDMA profile created in the previous section; it can be obtained with the osdsctl profile list command. For example:

                                                                  $ vi csi/server/examples/kubernetes/nginx.yaml
                                                                  profile: 6f7b121d-646a-45a8-b573-4f7ebd2a3cb4
                                                                  $ kubectl apply -f csi/server/examples/kubernetes/nginx.yaml
                                                                  $ kubectl get pod
                                                                  NAME                                 READY   STATUS    RESTARTS   AGE
                                                                  csi-attacher-opensdsplugin-0         3/3     Running   0          4d9h
                                                                  csi-nodeplugin-opensdsplugin-w9qxp   2/2     Running   0          4d9h
                                                                  csi-provisioner-opensdsplugin-0      2/2     Running   0          4d9h
                                                                  csi-snapshotter-opensdsplugin-0      2/2     Running   0          4d9h
                                                                  nginx                                1/1     Running   0          4d9h
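
Optionally, the volume provisioned for the pod can also be checked from the OpenSDS side, assuming the osdsctl volume subcommand is available in this build:

$ osdsctl volume list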
                                                                  

9. The I/O benchmark using fio can be launched from inside the new nginx pod.

$ kubectl exec -it nginx sh
# mount | grep nvme
/dev/nvme0n1 on /var/lib/www/html type ext4 (rw,relatime)
# apt update
# apt install fio
# fio --blocksize=1m --iodepth=8 --rw=randwrite --ioengine=libaio --size=40G --ramp_time=10 --runtime=60 --group_reporting --thread --direct=1 --name=fio-write --numjobs=8 --filename=/var/lib/www/html/test

                                                                    Here is a demo video showing how NVMeoF can be configured and used with SODA/OpenSDS in Kubernetes.

                                                                    NVMeoF with Intel® Virtual RAID on CPU

                                                                    Intel® Virtual RAID on CPU (Intel® VROC) is a hybrid RAID solution designed for NVMe SSDs connected directly to CPUs. It provides compelling RAID performance that unleashes the full potential of NVMe drives without additional hardware, such as PCIe RAID cards. This section describes how to configure Intel VROC and use different levels of RAID on the network fabrics in the OpenSDS environment.

                                                                     

                                                                    Follow the installation guide to install the Intel VROC upgrade key to the system. In our setup, a premium key (VROCPREMMOD) has been used. Figure 2 shows the key being plugged into the mainboard.

                                                                     

Figure 2. Hardware setup with an Intel VROC premium upgrade key

                                                                    Figure 3. BIOS setup for RAIDs using the Intel VROC with VMD

                                                                     

Figure 3 depicts the steps to set up RAID0 mode in Intel VROC with 4 Intel® Optane™ SSD DC P4800X Series PCIe drives. The settings can be accessed via the BIOS menu (F2), under Advanced => PCI Configurations => UEFI Option ROM Control => Intel Virtual RAID on CPU. Here, users can create and delete various RAID mode configurations.

Software RAID control is also supported on Clear Linux through the mdadm tool. Refer to the guide for more information. Here are some examples that set up different RAID modes using mdadm; in each example, the first command stops any existing array.

                                                                     

• RAID0 with 4 drives

  $ sudo mdadm -S -s
  $ sudo mdadm -C /dev/md/imsm0 /dev/nvme[0-3]n1 -n 4 -l 0

• RAID0 with 2 drives

  $ sudo mdadm -S -s
  $ sudo mdadm -C /dev/md/imsm0 /dev/nvme[0-1]n1 -n 2 -l 0

• RAID5 with 4 drives

  $ sudo mdadm -S -s
  $ sudo mdadm -C /dev/md/imsm0 /dev/nvme[0-3]n1 -n 4 -l 5

• RAID10 with 4 drives

  $ sudo mdadm -S -s
  $ sudo mdadm -C /dev/md/imsm0 /dev/nvme[0-3]n1 -n 4 -l 10
                                                                            
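Array assembly and any initial resync can also be monitored through /proc/mdstat:

$ cat /proc/mdstat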

                                                                            After that, the new configuration can be verified. For example, for the case of RAID5, execute the following command.

                                                                            $ lsblk
                                                                            NAME    MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
                                                                            sda       8:0    0 223.6G  0 disk 
                                                                            ├─sda1    8:1    0   142M  0 part 
                                                                            ├─sda2    8:2    0   244M  0 part 
                                                                            └─sda3    8:3    0 223.2G  0 part  /
                                                                            sr0      11:0    1  1024M  0 rom  
                                                                            nvme0n1 259:0    0 349.3G  0 disk 
                                                                            └─md127   9:127  0     1T  0 raid5
                                                                            nvme1n1 259:1    0 349.3G  0 disk 
                                                                            └─md127   9:127  0     1T  0 raid5
                                                                            nvme2n1 259:2    0 349.3G  0 disk 
                                                                            └─md127   9:127  0     1T  0 raid5
                                                                            nvme3n1 259:3    0 349.3G  0 disk 
                                                                            └─md127   9:127  0     1T  0 raid5
                                                                            
                                                                            

                                                                              Details of the newly-created RAID volume can be checked by executing the following command.

                                                                              $ sudo mdadm --detail /dev/md127
                                                                              /dev/md127:
                                                                                         Version : 1.2
                                                                                   Creation Time : Fri Feb  7 14:32:59 2020
                                                                                      Raid Level : raid5
                                                                                      Array Size : 1098481152 (1047.59 GiB 1124.84 GB)
                                                                                   Used Dev Size : 366160384 (349.20 GiB 374.95 GB)
                                                                                    Raid Devices : 4
                                                                                   Total Devices : 4
                                                                                     Persistence : Superblock is persistent
                                                                               
                                                                                   Intent Bitmap : Internal
                                                                               
                                                                                     Update Time : Wed Jun  3 11:50:57 2020
                                                                                           State : clean
                                                                                  Active Devices : 4
                                                                                 Working Devices : 4
                                                                                  Failed Devices : 0
                                                                                   Spare Devices : 0
                                                                               
                                                                                          Layout : left-symmetric
                                                                                      Chunk Size : 512K
                                                                               
                                                                              Consistency Policy : bitmap
                                                                               
                                                                                            Name : clr1-nvmetarget:imsm0  (local to host clr1-nvmetarget)
                                                                                            UUID : 37fe1cd0:f723162d:ddb5c702:49729f89
                                                                                          Events : 144
                                                                               
                                                                                  Number   Major   Minor   RaidDevice State
                                                                                     0     259        0        0      active sync   /dev/nvme0n1
                                                                                     1     259        1        1      active sync   /dev/nvme1n1
                                                                                     2     259        3        2      active sync   /dev/nvme3n1
                                                                                     4     259        2        3      active sync   /dev/nvme2n1
                                                                              
                                                                              

The NVMe SSDs in RAID5 mode shown above have been mapped to the /dev/md127 device. To use them over NVMeoF in Kubernetes (with SODA), two steps are needed, just as in the case of the single NVMe SSD (/dev/nvme0n1): enable the NVMeoF target on the target system, and then specify the device as LVM_DEVICE in the SODA/OpenSDS setup.

                                                                                Use the following script to enable the NVMe SSDs under RAID mode.

                                                                                #!/bin/bash
                                                                                mkdir /sys/kernel/config/nvmet/subsystems/nvme-raid-rdma
                                                                                cd /sys/kernel/config/nvmet/subsystems/nvme-raid-rdma/
                                                                                echo 1 > attr_allow_any_host
                                                                                mkdir namespaces/10
                                                                                cd namespaces/10/
echo -n /dev/md127 > device_path
                                                                                echo 1 > enable
                                                                                mkdir /sys/kernel/config/nvmet/ports/2
                                                                                cd /sys/kernel/config/nvmet/ports/2/
                                                                                echo 192.168.2.1 > addr_traddr
                                                                                echo rdma > addr_trtype
                                                                                echo 4421 > addr_trsvcid
                                                                                echo ipv4 > addr_adrfam
                                                                                ln -s /sys/kernel/config/nvmet/subsystems/nvme-raid-rdma /sys/kernel/config/nvmet/ports/2/subsystems/nvme-raid-rdma
                                                                                

The script is similar to the single NVMe (no RAID) case, but the subsystem is exposed on a different port and under a different name. If there are more SSDs, they can be configured in a combination of different RAID modes, which Intel VROC supports, and exposed simultaneously over NVMeoF using different ports and subsystem names.
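If you want to verify the new target directly from an initiator (outside of Kubernetes), the same nvme-cli commands as before apply, only with the new port and subsystem name:

$ sudo nvme discover -t rdma -a 192.168.2.1 -s 4421
$ sudo nvme connect -t rdma -n nvme-raid-rdma -s 4421 -a 192.168.2.1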

To use /dev/md127 in SODA/OpenSDS, refer to the section Build and install OpenSDS on the target host and change the value of LVM_DEVICE in the lvm.sh file to /dev/md127.
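A minimal sketch of that change, assuming LVM_DEVICE is assigned directly in lvm.sh (the exact surrounding syntax in the script may differ):

$ vi ./install/devsds/lib/lvm.sh
# point the OpenSDS LVM backend at the RAID volume instead of the single SSD
LVM_DEVICE=/dev/md127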

We measured the performance of the NVMe SSDs, both locally and over the fabric, at different RAID modes. Figure 4 shows the bandwidth results (randwrite & randread) at 64k block size**.

                                                                                  ** Please see Appendix for footnotes.

NVMeoF vs. local NVMe performance measurement

Figure 4. Data rate in MB/s for read and write (randread & randwrite using FIO) at different block sizes for 1 drive (no RAID), 2 drives (RAID0), 4 drives (RAID0), 4 drives (RAID10), and 4 drives (RAID5), measured locally and while connected over Intel OPA.

                                                                                  We can see that Intel VROC helps enable different modes of RAID, and that NVMeoF enables access speeds almost equivalent to the local performance.

Here is a demo video showing 4 NVMe SSDs in RAID0 mode over the fabric being used in Kubernetes. It walks through the system setup step by step and then shows how to run the benchmark.

                                                                                  Conclusion

In this article, we demonstrated how to use Intel Omni-Path Architecture (Intel OPA) and the Intel SSD Data Center Family for NVMe with open-source projects to construct NVMe over Fabrics in native and cloud environments. We also showed that NVMeoF can be combined with Intel Virtual RAID on CPU technology to provide a simple RAID solution with compelling performance that unleashes the full potential of NVMe drives over the fabric.

                                                                                  Appendix

                                                                                  Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit www.intel.com/benchmarks.

                                                                                  Performance results are based on testing as of November 17, 2019 and may not reflect all publicly available security updates. See configuration disclosure for details. No product can be absolutely secure.

                                                                                  System information

                                                                                  Target system

System configuration: Intel® Server Board S2600WFT family, Intel® Xeon® 8200 Series Processors, 24 cores @ 2.4 GHz, 192 GB RAM, BIOS release 02/27/2019, BIOS version: SE5C620.86B.0D.01.0395.022720191340

OS: Clear Linux OS v31640, kernel-5.3.11-868.native, mdadm v4.1, Intel® VROC Premium version 6.0.0.1024, 4x Intel® SSD DC P4800X Series 375 GB, drive firmware: E2010435

                                                                                  BIOS setting: Hyper-threading enabled, Package C-State set to C6(non retention state) and Processor C6 set to enabled, P-States set to default and SpeedStep and Turbo are enabled

                                                                                  Workload Generator: FIO 3.19, RANDOM: Workers-8, IOdepth- 32, No Filesystem, CPU Affinitized

Storage: 4x Intel® SSD DC P4800X Series (SSDPE21K375GA), 375 GB, Firmware: E2010423

                                                                                  Network: Intel® OPA 100HFA018FS, Firmware: 1.27.0

                                                                                  Initiator system

System configuration: Intel® Server Board S2600WFT family, Intel® Xeon® 8200 Series Processors, 24 cores @ 2.3 GHz, 192 GB RAM, BIOS release 10/04/2018, BIOS version: SE5C620.86B.0D.01.0134.100420181737

                                                                                  OS: Clear Linux OS v31640, kernel-5.3.11-868.native

                                                                                  BIOS setting: Hyper-threading enabled, Package C-State set to C6(non retention state) and Processor C6 set to enabled, P-States set to default and SpeedStep and Turbo are enabled

                                                                                  Workload Generator: FIO 3.19, RANDOM: Workers-8, IOdepth- 32, No Filesystem, CPU Affinitized, Kubernetes v0.16.2

                                                                                  Network: Intel® OPA 100HFA018FS, Firmware: 1.27.0

                                                                                  Disclaimers

                                                                                  Intel technologies' features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No product or component can be absolutely secure. Check with your system manufacturer or retailer or learn more at intel.com.

                                                                                  Intel, Xeon, Optane, and the Intel logo are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries.

                                                                                  *Other names and brands may be claimed as the property of others.

                                                                                  © 2020 Intel Corporation