Overview

Welcome to the High-Performance Big Data (HiBD) project, created by the Network-Based Computing Laboratory of The Ohio State University. As of April 2015, the project's site has recorded more than 3,050 downloads. The HiBD project contains the following packages:


RDMA-based Apache Hadoop 2.x (RDMA-Hadoop-2.x)

The RDMA for Apache Hadoop package is a derivative of Apache Hadoop. This package can be used to exploit the performance of modern clusters with RDMA-enabled interconnects for Big Data applications. Major features of this package include:

  • Based on Apache Hadoop 2.6.0
  • Compliant with Apache Hadoop 2.6.0 APIs and applications
  • High-performance design with native InfiniBand and RoCE support at the verbs level for HDFS, MapReduce, and RPC components
  • Enhanced hybrid HDFS design with in-memory and heterogeneous storage (HHH)
    • Supports three modes of operation (default, in-memory, and Lustre-integrated)
    • Policies to efficiently utilize heterogeneous storage devices (RAM Disk, SSD, HDD, and Lustre)
    • Hybrid replication (in-memory and persistent storage) for HHH default mode
    • Memory replication (in-memory only with lazy persistence) for HHH in-memory mode
    • Lustre-based fault-tolerance for HHH Lustre-integrated mode
  • High-performance design of MapReduce over Lustre
    • Supports two shuffle approaches (Lustre read and RDMA)
    • Hybrid shuffle based on both shuffle approaches
  • Easily configurable for different running modes (HHH, HHH-M, HHH-L, and MapReduce over Lustre) and different protocols (native InfiniBand, RoCE, and IPoIB)
A complete set of features and supported platforms can be found here.

RDMA-based Apache Hadoop 1.x (RDMA-Hadoop-1.x)

The RDMA for Apache Hadoop package is a derivative of Apache Hadoop. This package can be used to exploit the performance of modern clusters with RDMA-enabled interconnects for Big Data applications. Major features of this package include:

  • Based on Apache Hadoop 1.2.1
  • Compliant with Apache Hadoop 1.2.1 APIs and applications
  • High-performance design with native InfiniBand and RoCE support at the verbs level for HDFS, MapReduce, and RPC components
  • Easily configurable for native InfiniBand, RoCE, and traditional sockets-based support (Ethernet and InfiniBand with IPoIB)

RDMA-based Memcached (RDMA-Memcached)

The RDMA for Memcached/libMemcached package is a derivative of Memcached/libMemcached. This package can be used to exploit the performance of modern clusters with RDMA-enabled interconnects for Memcached-based applications. Major features of this package include:

  • Based on Memcached 1.4.22 and libMemcached 1.0.18
  • Compliant with libMemcached APIs and applications
  • High-performance design with native InfiniBand and RoCE support at the verbs level for Memcached Server and Client
  • High-performance design of SSD-assisted hybrid memory
  • Support for both RDMA-enhanced and socket-based Memcached clients
  • Easily configurable for native InfiniBand, RoCE, and traditional sockets-based support (Ethernet and InfiniBand with IPoIB)

OSU HiBD-Benchmarks (OHB)

The OSU HiBD-Benchmarks project aims to develop benchmarks for evaluating Big Data middleware. The current version (0.7.1) of OHB consists of microbenchmarks for Memcached.

Announcements


RDMA-Apache-Hadoop-2.x 0.9.6 (based on Apache Hadoop 2.6.0) is available. It offers a hybrid HDFS design with in-memory and heterogeneous storage (HHH), MapReduce over Lustre, and easy configuration for different running modes (HHH, HHH-M, HHH-L, and MapReduce over Lustre) and different protocols (native InfiniBand, RoCE, and IPoIB). [more]

RDMA-Memcached 0.9.3 (based on Memcached 1.4.22 and libMemcached 1.0.18) is available. It offers native InfiniBand and RoCE support, a high-performance design of SSD-assisted hybrid memory, and easy configuration for InfiniBand-RDMA, RoCE, and sockets. [more]

The High Performance Big Data Computing (HPBDC) International Workshop is soliciting technical papers; see the workshop details and paper submission instructions.

A tutorial on Big Data, RDMA for Apache Hadoop, Spark, and Memcached will be presented at the HPCA 2015, ASPLOS 2015, and ISCA 2015 conferences.

OSU HiBD Benchmarks (OHB) 0.7.1 with Memcached benchmarks (Get, Set, and Mixed Get/Set) is available. [more]

HiBD in the News