For enterprises deploying NVMe over Fabrics, choosing between Fibre Channel and RDMA can be difficult, because both have advantages and disadvantages.
In the last few years, enterprises have increasingly demanded infrastructure that delivers high throughput and low latency for hosted applications. Faster networking with high-speed Ethernet, Fibre Channel, and Infiniband offers end-to-end speeds ranging from 10 Gb/s to 128 Gb/s.
Enterprises are also starting to realize the performance and latency benefits offered by the NVMe protocol with storage arrays featuring high-speed NAND flash and next-generation SSDs.
But a latency bottleneck has arisen in the implementation of shared storage or storage area networking where data needs to be transferred between the host (initiator) and the NVMe-enabled storage array (target) over Ethernet, RDMA technologies (iWARP/RoCE), or Fibre Channel.
The NVMe bottleneck
Latency rises when SCSI commands transported over Fibre Channel must be interpreted and translated into NVMe commands.
NVMe over Fabrics (NVMe-oF) is a network protocol introduced by NVM Express to address this bottleneck. NVMe-oF offers an alternative to SCSI-based storage networking protocols such as iSCSI, allowing enterprises to experience the full benefits of NVMe-enabled storage arrays. It acts as a messaging layer between the host computer and target SSDs or shared storage systems over high-speed fabrics such as RDMA networks or Fibre Channel.
NVMe-oF supports five technologies: RDMA (RoCE, iWARP), Fibre Channel (FC-NVMe), Infiniband, Future Fabrics, and Intel Omni-Path architecture.
In addition, NVMe-oF allows control traffic and data traffic to be separated, which simplifies traffic management. It also takes advantage of the internal parallelism of storage devices and lowers I/O overhead. Together, these improve overall data access performance and reduce latency.
NVMe-oF offers a performance boost to enterprises that are deploying machine learning applications, big data, and Internet of Things (IoT) analytics, which demand real-time access to stored data without any distance dependencies.
Performance evaluation of NVMe-oF over Fibre Channel and RDMA
Recent conferences have sparked debate about which transport delivers the best performance with the NVMe-oF protocol. Some vendors firmly believe that RDMA is the better option for higher throughput, while many others stick with Fibre Channel for its performance advantages.
Both network fabric technologies have their own benefits and pitfalls.
NVMe over Fabrics using Fibre Channel
NVMe over Fibre Channel relies on two standards: NVMe-oF and FC-NVMe. NVMe-oF is the protocol defined by the NVM Express organization for transporting NVMe traffic over a network fabric, and FC-NVMe is the Fibre Channel-specific transport standard. Together, the two make up the complete solution. A majority of enterprises already use Fibre Channel technology to move their critical data to and from storage arrays.
Fibre Channel was designed specifically for storage devices and systems, and it is the de facto standard for enterprise storage area networking (SAN) solutions. Its main advantage is that it can carry traffic for the existing SCSI storage protocol and the new NVMe protocol concurrently, using the same hardware resources in the storage infrastructure. This co-existence of SCSI and NVMe on Fibre Channel benefits most enterprises because they can enable NVMe operations with just a simple software upgrade.
In March 2018, NVM Express added a new feature called Asymmetric Namespace Access (ANA) to the NVMe-oF protocol. This allows multi-path I/O support among multiple hosts and namespaces.
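As a rough illustration of how a host might use ANA when several paths lead to the same namespace, the sketch below ranks paths by their reported ANA state. The state names follow the NVMe specification, but the `Path` class, the path labels, and the `pick_path` helper are hypothetical and do not come from any real multipath driver.

```python
from dataclasses import dataclass

# ANA states that can serve I/O; a lower rank means a more preferred path.
# Other states ("inaccessible", "persistent-loss", "change") are treated
# as unusable in this simplified sketch.
ANA_RANK = {"optimized": 0, "non-optimized": 1}


@dataclass
class Path:
    name: str       # hypothetical label for a host-to-controller path
    ana_state: str  # ANA state reported for the namespace on this path


def pick_path(paths):
    """Return the usable path with the best ANA state, or None."""
    usable = [p for p in paths if p.ana_state in ANA_RANK]
    if not usable:
        return None
    return min(usable, key=lambda p: ANA_RANK[p.ana_state])


paths = [
    Path("ctrl-a", "non-optimized"),
    Path("ctrl-b", "optimized"),
    Path("ctrl-c", "inaccessible"),
]
best = pick_path(paths)
print(best.name if best else "no usable path")
```

A real host would also react to state changes over time; this sketch only shows the one-off selection.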
Gen 5 and Gen 6 are the newer generations of Fibre Channel. Gen 6 supports transfer speeds of up to 128 Gb/s, the highest in storage networking. Gen 6 also adds monitoring and diagnostics capabilities that provide visibility into latency levels and IOPS. NVMe-oF integrates with both generations.
According to a Demartek report, NVMe over Fibre Channel delivers 58% higher IOPS and 34% lower latency than the SCSI-based Fibre Channel protocol. Large enterprises favor FC-NVMe for processing critical workloads because of its simplicity, reliability, predictability, and performance.
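To make the reported percentages concrete, the short calculation below applies them to a baseline. The baseline figures (100,000 IOPS and 500 microseconds) are hypothetical, chosen only to keep the arithmetic easy to follow; they are not taken from the Demartek report.

```python
def apply_reported_gains(scsi_iops, scsi_latency_us,
                         iops_gain=0.58, latency_cut=0.34):
    """Scale a SCSI-over-FC baseline by the gains quoted in the text."""
    nvme_iops = scsi_iops * (1 + iops_gain)
    nvme_latency_us = scsi_latency_us * (1 - latency_cut)
    return nvme_iops, nvme_latency_us


# Hypothetical baseline values, for illustration only.
iops, latency_us = apply_reported_gains(scsi_iops=100_000,
                                        scsi_latency_us=500.0)
print(f"{iops:,.0f} IOPS at {latency_us:.0f} us")
```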
However, this implementation requires more expertise at the storage networking level, which may add costs.
NVMe over Fabrics using RDMA
RDMA offers an alternative to Fibre Channel. According to WhatIs.com, “Remote Direct Memory Access (RDMA) is a technology that allows computers in a network to exchange data in main memory without involving the processor, cache, or operating system of either computer.”
In other words, RDMA allows applications to bypass the software stack for processing network traffic. Because an RDMA data transfer involves fewer resources, RDMA helps enterprises achieve higher throughput and better performance with lower latency. With RDMA, NVMe-enabled storage devices behave almost as if they were local to the host.
RDMA can be enabled in storage networking with protocols like RoCE (RDMA over Converged Ethernet), iWARP (Internet Wide Area RDMA Protocol), and Infiniband.
iWARP is essentially RDMA over TCP/IP. It uses TCP or Stream Control Transmission Protocol (SCTP) for data transmission.
RoCE enables RDMA over Ethernet. It is sometimes described as Infiniband over Ethernet. There are two versions, RoCE v1 and RoCE v2, which are incompatible with each other because they use different transport mechanisms.
Infiniband is widely supported by vendors offering high-performance computing solutions. It is the fastest RDMA storage networking technology, with data transfer speeds of around 100 Gb/s, compared to the up to 128 Gb/s offered by Gen 6 FC-NVMe. Like FC-NVMe, Infiniband is a lossless transmission protocol that provides quality of service (QoS) mechanisms along with credit-based flow control.
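For a sense of what the difference between these link speeds means in practice, the sketch below computes the ideal time to move one terabyte at each rate. It assumes a decimal terabyte and ignores protocol overhead, encoding, and congestion, so real transfers would take longer.

```python
def transfer_seconds(size_bytes, link_gbps):
    """Ideal time to move size_bytes over a link of link_gbps gigabits/s."""
    bits = size_bytes * 8
    return bits / (link_gbps * 1e9)


ONE_TERABYTE = 10**12  # decimal terabyte, in bytes

for label, gbps in (("Infiniband, ~100 Gb/s", 100),
                    ("Gen 6 Fibre Channel, 128 Gb/s", 128)):
    print(f"{label}: {transfer_seconds(ONE_TERABYTE, gbps):.1f} s")
```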
Some vendors consider RDMA to be highly compatible with NVMe use cases because both rely on a similar queueing structure. The main reason for using RDMA-based technologies is that command transfer does not require any encapsulation or translation of commands, since both sides use that queueing structure to move data without CPU intervention. In this way RDMA saves CPU cycles, which lowers latency in data transmission from hosts to storage devices.
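The queueing structure mentioned here is built around paired queues: the host posts commands to a submission queue and later collects results from a completion queue. The toy model below shows only that pairing. The class and method names are hypothetical, and real implementations use ring buffers in memory rather than Python deques.

```python
from collections import deque


class QueuePair:
    """Toy model of a submission/completion queue pair."""

    def __init__(self, depth):
        self.depth = depth
        self.submission = deque()  # commands posted by the host
        self.completion = deque()  # results posted by the target

    def submit(self, command_id, opcode):
        """Host side: post a command if the submission queue has room."""
        if len(self.submission) >= self.depth:
            raise RuntimeError("submission queue full")
        self.submission.append((command_id, opcode))

    def process_one(self):
        """Target side: consume one command and post its completion."""
        command_id, _opcode = self.submission.popleft()
        self.completion.append((command_id, "success"))

    def reap(self):
        """Host side: collect every completion posted so far."""
        done = list(self.completion)
        self.completion.clear()
        return done


qp = QueuePair(depth=4)
qp.submit(1, "read")
qp.submit(2, "write")
qp.process_one()
qp.process_one()
print(qp.reap())
```

Because the host only appends to one queue and reads from the other, many such pairs can run in parallel without contending for a shared lock, which is one reason the model suits fast storage.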
- With Fibre Channel, enterprises can preserve their existing hardware investment while taking full advantage of a fully NVMe-enabled storage infrastructure. NVMe-oF implementations based on Infiniband, RDMA (iWARP or RoCE), and Ethernet often require new hardware.
- Fibre Channel fabric has a "buffer-to-buffer credit" flow control feature with which it assures quality of service (QoS) by keeping network traffic lossless. RDMA over Ethernet (iWARP and RoCE) requires additional protocol support to enable this feature.
- Compared to other network fabric options, Fibre Channel requires less configuration to initiate network traffic.
- Fibre Channel fabric has a feature to automatically discover and add host initiator and target storage devices and their properties. RDMA Ethernet (iWARP and RoCE) and Infiniband lack this capability.
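The buffer-to-buffer credit mechanism in the list above can be sketched as a simple counter: the sender spends one credit per frame and must wait when none remain, and the receiver hands a credit back each time it frees a buffer. The `CreditedLink` class below is a hypothetical toy model of that idea, not an implementation of the Fibre Channel standard.

```python
class CreditedLink:
    """Toy model of buffer-to-buffer credit flow control."""

    def __init__(self, credits):
        self.credits = credits      # frames the sender may still transmit
        self.receiver_buffers = []  # frames waiting at the receiver

    def send(self, frame):
        """Transmit only if a credit is available; never drop a frame."""
        if self.credits == 0:
            return False            # sender must hold the frame and retry
        self.credits -= 1
        self.receiver_buffers.append(frame)
        return True

    def receiver_drain(self):
        """Receiver frees one buffer and returns one credit to the sender."""
        frame = self.receiver_buffers.pop(0)
        self.credits += 1
        return frame


link = CreditedLink(credits=2)
results = [link.send(f) for f in ("f1", "f2", "f3")]
print(results)          # the third frame is held back, not dropped
link.receiver_drain()
print(link.send("f3"))  # retried after a credit has been returned
```

Since the sender never transmits into a full receiver, frames are delayed rather than lost, which is what makes the link lossless.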
According to a 2016 NVMe ecosystem market sizing report published by G2M Research, the NVMe market will be worth more than $57 billion by 2020, and more than 50% of enterprise servers will be NVMe-enabled by 2020.
NVMe over Fabrics takes the NVMe boost to a network, providing efficient, reliable and highly agile storage networks to be used for advanced use cases like artificial intelligence/machine learning, IoT, real-time analytics, and mission-critical applications.
But enterprises have to weigh their investment capacity against the different kinds of NVMe-oF implementations. RDMA offers advantages suited to advanced use cases that need real-time access to storage, but enterprises can also leverage FC-NVMe by transitioning to Gen 6, which offers the highest data transfer speed with low latency.
In the coming years, NVMe integration will be crucial for enterprises that are transitioning their IT infrastructure ecosystem for digital transformation.