MPI4Spark 0.2 Features

  • Based on Apache Spark 3.3.0
  • (NEW) Support for the YARN cluster manager
  • Compliant with user-level Apache Spark APIs and packages
  • High performance design that utilizes MPI-based communication
    • Utilizes MPI point-to-point operations
    • Relies on MPI Dynamic Process Management (DPM) features for launching executor processes (see the sketch after this list)
    • (NEW) Relies on the Multiple-Program-Multiple-Data (MPMD) MPI launcher mode for launching executors when using the YARN cluster manager
  • Built on top of the MVAPICH2-J Java bindings for the MVAPICH2 family of MPI libraries
  • Tested with
    • (NEW) OSU HiBD-Benchmarks (GroupBy and SortBy)
    • (NEW) Intel HiBench Suite (Micro Benchmarks, Machine Learning, and Graph Workloads)
    • Mellanox InfiniBand adapters (EDR and HDR 100G and 200G)
    • HPC systems with Intel OPA and Cray Slingshot interconnects
    • Various multi-core platforms
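
As a rough illustration of the launching mechanism referenced above: MPI4Spark itself is built over the MVAPICH2-J Java bindings, but the two MPI features it relies on, Dynamic Process Management (DPM) and point-to-point messaging, can be sketched with mpi4py. In the MPMD launcher mode used with YARN, the driver and executors would instead be started together by the MPI launcher (e.g., mpirun -np 1 driver : -np 4 worker). Everything below (file names, messages, process counts) is illustrative and is not MPI4Spark code.

    # driver.py -- illustrative only; launch with: mpirun -np 1 python3 driver.py
    from mpi4py import MPI

    # DPM: the driver spawns worker (executor-like) processes at runtime.
    workers = MPI.COMM_SELF.Spawn("python3", args=["worker.py"], maxprocs=4)
    n = workers.Get_remote_size()

    # Point-to-point: hand each worker a (hypothetical) task and collect replies.
    for rank in range(n):
        workers.send({"task": "demo", "id": rank}, dest=rank, tag=0)
    replies = [workers.recv(source=r, tag=1) for r in range(n)]
    print(replies)
    workers.Disconnect()

    # worker.py -- each spawned process talks to the driver over the parent
    # intercommunicator.
    from mpi4py import MPI
    parent = MPI.Comm.Get_parent()
    task = parent.recv(source=0, tag=0)
    parent.send({"done": task["id"]}, dest=0, tag=1)
    parent.Disconnect()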

MPI4Dask 0.3 Features

  • (NEW) Based on Dask Distributed 2022.8.1
  • Compliant with user-level Dask APIs and packages
  • Support for MPI-based communication for GPU-based Dask applications
  • Implements point-to-point communication co-routines
  • Efficient chunking mechanism implemented for large messages
  • Built on top of mpi4py over MVAPICH2, MVAPICH2-X, and MVAPICH2-GDR libraries
  • Support for MPI-based communication for CPU-based Dask applications
  • Supports starting execution of Dask programs using Dask-MPI (see the sketch after this list)
  • Tested with
    • (NEW) Mellanox InfiniBand adapters (FDR, EDR, and HDR)
    • (NEW) Various multi-core platforms
    • (NEW) Various benchmarks used by the community (MatMul, Slicing, Sum Transpose, cuDF Merge, etc.)
    • (NEW) Scaling out to hundreds of workers (processes)
    • (NEW) NVIDIA V100 and A100 GPUs
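
A minimal sketch of the Dask-MPI start-up path mentioned above, assuming the dask_mpi and dask.distributed packages are available and the script is launched with an MPI launcher (e.g., mpirun -np 8 python3 dask_program.py); the array sizes and chunking are placeholders, and nothing here is MPI4Dask-specific code:

    from dask_mpi import initialize
    from dask.distributed import Client
    import dask.array as da

    initialize()        # rank 0 becomes the scheduler, rank 1 runs this client
                        # code, and the remaining ranks become Dask workers
    client = Client()   # connect to the scheduler started by initialize()

    # MatMul-style workload, similar in spirit to the tested benchmarks above.
    x = da.random.random((20000, 20000), chunks=(2000, 2000))
    print((x @ x.T).sum().compute())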

RDMA-Apache-Hadoop-3.x 0.9.1 Features

  • Based on Apache Hadoop 3.0.0
  • Compliant with Apache Hadoop 3.0.0 APIs and applications (see the sketch after this list)
  • Support for RDMA Device Selection
  • High performance design with native InfiniBand and RoCE support at the verbs level for HDFS component
  • Supports deploying Hadoop with Slurm and PBS in different running modes (HHH and HHH-M)
  • Easily configurable for different running modes (HHH and HHH-M) and different protocols (native InfiniBand, RoCE, and IPoIB)
  • On-demand connection setup
  • HDFS over native InfiniBand and RoCE
    • RDMA-based write
    • RDMA-based replication
    • Overlapping of different stages of write and replication
    • Enhanced hybrid HDFS design with in-memory and heterogeneous storage (HHH)
      • Supports two modes of operation
        • HHH (default) with I/O operations over RAM disk, SSD, and HDD
        • HHH-M (in-memory) with I/O operations in-memory
      • Policies to efficiently utilize heterogeneous storage devices (RAM Disk, SSD, and HDD)
        • Support for Greedy and Balanced policies
        • Automatic policy selection based on available storage types
      • Hybrid replication (in-memory and persistent storage) for HHH default mode
      • Memory replication (in-memory only with lazy persistence) for HHH-M mode
  • Tested with
    • Mellanox InfiniBand adapters (QDR, FDR, and EDR)
    • RoCE support with Mellanox adapters
    • RAM Disks, SSDs, and HDDs
    • OpenJDK and IBM JDK
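
Because the package is API-compliant with stock Apache Hadoop 3.0.0, everyday HDFS usage stays the same regardless of whether the cluster is configured for the HHH or HHH-M running mode. The sketch below simply drives the standard hdfs dfs command line from Python; the paths are placeholders:

    import subprocess

    def hdfs(*args):
        """Run a standard 'hdfs dfs' sub-command and fail loudly on error."""
        subprocess.run(["hdfs", "dfs", *args], check=True)

    hdfs("-mkdir", "-p", "/user/demo")
    hdfs("-put", "-f", "local_input.txt", "/user/demo/input.txt")
    hdfs("-cat", "/user/demo/input.txt")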

RDMA-Apache-Spark 0.9.5 Features

New features and enhancements compared to the 0.9.4 release are marked as (NEW).

  • Based on Apache Spark 2.1.0
  • (NEW) Built with Apache Hadoop 2.8.0
  • (NEW) Initial support for POWER architecture
  • (NEW) Performance optimization and tuning on OpenPOWER clusters
  • (NEW) Support for various communication thread binding policies
  • (NEW) Support for RDMA Device Selection
  • High performance design with native InfiniBand and RoCE support at the verbs-level for Spark
    • RDMA-based data shuffle
    • SEDA-based shuffle architecture
    • Support for pre-connection, on-demand connection, and connection sharing
    • Non-blocking and chunk-based data transfer
    • Off-JVM-heap buffer management
  • Compliant with Apache Spark 2.1.0 APIs and applications
  • RDMA support for Spark SQL (see the sketch after this list)
  • Integration with HHH in RDMA for Apache Hadoop
  • Easily configurable for native InfiniBand, RoCE, and the traditional sockets-based support (Ethernet and InfiniBand with IPoIB)
  • Tested with
    • (NEW) Various multi-core platforms (e.g., x86, POWER)
    • (NEW) OpenJDK and IBM JDK
    • Mellanox InfiniBand adapters (DDR, QDR, FDR, and EDR)
    • RoCE support with Mellanox adapters
    • RAM Disks, SSDs, and HDDs
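
To make the compliance and Spark SQL points concrete, here is a minimal, unmodified PySpark 2.1-style application; the aggregation triggers a shuffle, which is the stage the RDMA-based shuffle design targets. Table contents and names are placeholders, and nothing here is RDMA-Spark-specific:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("sql-demo").getOrCreate()

    df = spark.createDataFrame(
        [("alice", 3), ("bob", 5), ("alice", 7)], ["user", "score"])
    df.createOrReplaceTempView("scores")

    # The GROUP BY forces a shuffle, the stage targeted by the RDMA design.
    spark.sql("SELECT user, SUM(score) AS total FROM scores GROUP BY user").show()
    spark.stop()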

RDMA-Apache-Kafka 0.9.1 Features

  • Based on Apache Kafka 1.0.0
  • High performance design with native InfiniBand and RoCE support at the verbs-level for Apache Kafka
  • Compliant with Apache Kafka 1.0.0 APIs and applications (see the sketch after this list)
  • Easily configurable for native InfiniBand, RoCE, and the traditional sockets-based support (Ethernet and InfiniBand with IPoIB)
  • On-demand connection setup
  • Support for RDMA Device Selection
  • Tested with
    • Mellanox InfiniBand adapters (DDR, QDR, FDR, and EDR)
    • RoCE support with Mellanox adapters
    • Various multi-core platforms
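
Since RDMA-Kafka is compliant with the Apache Kafka 1.0.0 client APIs, a stock client such as kafka-python should work unchanged; the broker address and topic name below are placeholders:

    from kafka import KafkaProducer, KafkaConsumer

    producer = KafkaProducer(bootstrap_servers="broker-host:9092")
    producer.send("demo-topic", b"hello from a stock client")
    producer.flush()

    consumer = KafkaConsumer("demo-topic",
                             bootstrap_servers="broker-host:9092",
                             auto_offset_reset="earliest",
                             consumer_timeout_ms=5000)
    for record in consumer:
        print(record.value)
    consumer.close()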

RDMA-Apache-Hadoop-2.x 1.3.5 Features

New features and enhancements compared to the 1.3.0 release are marked as (NEW).

  • Based on Apache Hadoop 2.8.0
  • Compliant with Apache Hadoop 2.8.0 APIs and applications (see the sketch after this list)
  • (NEW) Support for Containers (Docker and Singularity)
  • Initial support for POWER architecture
  • Performance optimization and tuning on OpenPOWER clusters
  • Support for RDMA Device Selection
  • Compliant with Hortonworks Data Platform (HDP) 2.5.0.3 and Cloudera Distribution Including Apache Hadoop (CDH) 5.8.2 APIs and applications
  • High performance design with native InfiniBand and RoCE support at the verbs level for HDFS, MapReduce, and RPC components
  • Plugin-based architecture supporting RDMA-based designs for HDFS (HHH, HHH-M, HHH-L, and HHH-L-BB), MapReduce, MapReduce over Lustre, RPC, etc.
    • Plugin for Cloudera Distribution Including Apache Hadoop (CDH) (tested with 5.8.2)
    • Plugin for Apache Hadoop distribution (tested with 2.7.3)
    • Plugin for Hortonworks Data Platform (HDP) (tested with 2.5.0.3)
  • Supports deploying Hadoop with Slurm and PBS in different running modes (HHH, HHH-M, HHH-L, and MapReduce over Lustre)
  • Easily configurable for different running modes (HHH, HHH-M, HHH-L, HHH-L-BB, and MapReduce over Lustre) and different protocols (native InfiniBand, RoCE, and IPoIB)
  • On-demand connection setup
  • HDFS over native InfiniBand and RoCE
    • RDMA-based write
    • RDMA-based replication
    • Parallel replication support
    • Overlapping of different stages of write and replication
    • Enhanced hybrid HDFS design with in-memory and heterogeneous storage (HHH)
      • Supports four modes of operation
        • HHH (default) with I/O operations over RAM disk, SSD, and HDD
        • HHH-M (in-memory) with I/O operations in-memory
        • HHH-L (Lustre-integrated) with I/O operations in local storage and Lustre
        • HHH-L-BB (Burst Buffer) with I/O operations in Memcached-based burst buffer (RDMA-based Memcached) over Lustre
      • Policies to efficiently utilize heterogeneous storage devices (RAM Disk, SSD, HDD, and Lustre)
        • Support for Greedy and Balanced policies
        • Automatic policy selection based on available storage types
      • Hybrid replication (in-memory and persistent storage) for HHH default mode
      • Memory replication (in-memory only with lazy persistence) for HHH-M mode
      • Lustre-based fault-tolerance for HHH-L mode
        • No HDFS replication
        • Reduced local storage space usage
  • MapReduce over native InfiniBand and RoCE
    • RDMA-based shuffle
    • Prefetching and caching of map output
    • In-memory merge
    • Advanced optimization in overlapping
      • map, shuffle, and merge
      • shuffle, merge, and reduce
    • Optional disk-assisted shuffle
    • Automatic Locality-aware Shuffle
    • Optimization of in-memory spill for Maps
    • High performance design of MapReduce over Lustre
      • Supports two shuffle approaches
        • Lustre read based shuffle
        • RDMA based shuffle
      • Hybrid shuffle based on both shuffle approaches
        • Configurable distribution support
      • In-memory merge and overlapping of different phases
  • Support for priority-based local directory selection in MapReduce Shuffle
  • RPC over native InfiniBand and RoCE
    • JVM-bypassed buffer management
    • RDMA or send/recv based adaptive communication
    • Intelligent buffer allocation and adjustment for serialization
  • Tested with
    • Mellanox InfiniBand adapters (DDR, QDR, FDR, and EDR)
    • RoCE support with Mellanox adapters
    • Various multi-core platforms (e.g., x86, POWER)
    • RAM Disks, SSDs, HDDs, and Lustre
    • OpenJDK and IBM JDK
    • (NEW) Docker 18.03.0-ce
    • (NEW) Singularity 2.4.6
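
Because the package is compliant with stock Apache Hadoop 2.8.0 APIs and applications, an ordinary Hadoop Streaming job runs unchanged, and its shuffle traverses the RDMA-based shuffle path described above. The word-count mapper and reducer below are a generic illustration (not part of this package); the jar path and HDFS paths in the launch command are placeholders that vary by installation:

    # mapper.py -- emit (word, 1) pairs
    import sys
    for line in sys.stdin:
        for word in line.split():
            print(f"{word}\t1")

    # reducer.py -- input arrives grouped and sorted by key, i.e. after the shuffle
    import sys
    current, count = None, 0
    for line in sys.stdin:
        word, n = line.rsplit("\t", 1)
        if word != current:
            if current is not None:
                print(f"{current}\t{count}")
            current, count = word, 0
        count += int(n)
    if current is not None:
        print(f"{current}\t{count}")

    # Launch (paths are placeholders):
    #   hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-*.jar \
    #       -files mapper.py,reducer.py \
    #       -mapper "python3 mapper.py" -reducer "python3 reducer.py" \
    #       -input /user/demo/in -output /user/demo/out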

RDMA-Memcached 0.9.6 Features

New features and enhancements compared to the 0.9.5 release are marked as (NEW).

  • (NEW) Based on Memcached 1.5.3
    • (NEW) Compliant with Memcached’s new item chaining feature in in-memory mode
    • (NEW) Compliant with Memcached’s latest LRU maintainer and slab balancer enhancements
  • Based on libMemcached 1.0.18
  • High performance design with native InfiniBand and RoCE support at the verbs-level for Memcached Server and Client
  • High performance design of SSD-assisted hybrid memory
  • (NEW) Runtime selection of HCA device for nodes equipped with multiple InfiniBand/RoCE HCAs
  • (NEW) Enable and disable item chaining through extended server options
  • Compliant with libMemcached 1.0.18 APIs and applications
  • Non-blocking libMemcached Set/Get API extensions
    • APIs to issue non-blocking set/get requests to the RDMA-based Memcached servers
    • APIs to support monitoring the progress of non-blocking requests issued in an asynchronous fashion
    • Facilitating overlap of concurrent set/get requests
  • Support for burst-buffer mode in the Lustre-integrated design of HDFS in RDMA for Apache Hadoop-2.x
  • Support for both RDMA-enhanced and socket-based Memcached clients (see the sketch after this list)
  • Easily configurable for native InfiniBand, RoCE and the traditional sockets-based support (Ethernet and InfiniBand with IPoIB)
  • On-demand connection setup
  • Tested with
    • Native Verbs-level support with Mellanox InfiniBand adapters (QDR, FDR, and EDR)
    • RoCE support with Mellanox adapters
    • Various multi-core platforms
    • SATA-SSD, PCIe-SSD, and NVMe-SSD
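
As a quick illustration of the socket-based client path noted above: a stock client such as pymemcache should work unchanged against an RDMA-Memcached server, since the standard Memcached protocol remains supported. The non-blocking set/get extensions are libMemcached C API additions and are not shown here; the host name, key, and value are placeholders:

    from pymemcache.client.base import Client

    client = Client(("memcached-host", 11211))
    client.set("demo-key", b"demo-value", expire=300)
    print(client.get("demo-key"))
    client.close()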

RDMA-Apache-Hadoop-1.x 0.9.9 Features

  • Based on Apache Hadoop 1.2.1
  • High performance design with native InfiniBand and RoCE support at the verbs-level for HDFS, MapReduce, and RPC components
  • Compliant with Apache Hadoop 1.2.1 APIs and applications
  • Easily configurable for native InfiniBand, RoCE, and the traditional sockets-based support (Ethernet and InfiniBand with IPoIB)
  • On-demand connection setup
  • HDFS over native InfiniBand and RoCE
    • RDMA-based write
    • RDMA-based replication
    • Parallel replication support
  • MapReduce over native InfiniBand and RoCE
    • RDMA-based shuffle
    • Prefetching and caching of map outputs
    • In-memory merge
    • Advanced optimization in overlapping
      • map, shuffle, and merge
      • shuffle, merge, and reduce
  • RPC over native InfiniBand and RoCE
    • JVM-bypassed buffer management
    • RDMA or send/recv based adaptive communication
    • Intelligent buffer allocation and adjustment for serialization
  • Tested with
    • Mellanox InfiniBand adapters (DDR, QDR, and FDR)
    • RoCE support with Mellanox adapters
    • Various multi-core platforms
    • Different file systems with disks and SSDs

RDMA-Apache-HBase 0.9.1 Features

  • Based on Apache HBase 1.1.2
  • High performance design with native InfiniBand and RoCE support at the verbs-level for Apache HBase
  • Compliant with Apache HBase 1.1.2 APIs and applications (see the sketch after this list)
  • Easily configurable for native InfiniBand, RoCE, and the traditional sockets-based support (Ethernet and InfiniBand with IPoIB)
  • On-demand connection setup
  • Tested with
    • Mellanox InfiniBand adapters (DDR, QDR, FDR, and EDR)
    • RoCE support with Mellanox adapters
    • Various multi-core platforms
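
Since RDMA-HBase is compliant with the Apache HBase 1.1.2 APIs, standard clients should work unchanged. The sketch below goes through the HBase Thrift gateway using the happybase Python client; it assumes a table named demo with a column family cf already exists, and the host name is a placeholder:

    import happybase

    connection = happybase.Connection("hbase-thrift-host")
    table = connection.table("demo")
    table.put(b"row-1", {b"cf:greeting": b"hello"})
    print(table.row(b"row-1"))
    connection.close()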

OSU HiBD-Benchmarks 0.9.3 Features

New features and enhancements compared to the 0.9.2 release are marked as (NEW).

  • Micro-benchmarks for Hadoop Distributed File System (HDFS)
    • Sequential Write Latency (SWL) Benchmark
    • Sequential Read Latency (SRL) Benchmark
    • Random Read Latency (RRL) Benchmark
    • Sequential Write Throughput (SWT) Benchmark
    • Sequential Read Throughput (SRT) Benchmark
    • Support for benchmarking
      • Apache Hadoop 1.x and 2.x HDFS
      • Hortonworks Data Platform (HDP) HDFS
      • Cloudera Distribution Including Apache Hadoop (CDH) HDFS
  • Micro-benchmarks for Memcached
    • Get Latency Benchmark
    • Set Latency Benchmark
    • Mixed Get/Set Latency Benchmark
    • Non-Blocking API Latency Benchmark
    • Hybrid Memory Latency Benchmark
    • (NEW) Yahoo! Cloud Serving Benchmark (YCSB) Extension for RDMA-Memcached
  • Micro-benchmarks for HBase
    • Get Latency Benchmark
    • Put Latency Benchmark
  • Micro-benchmarks for Spark (see the sketch after this list)
    • GroupBy
    • SortBy
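
For reference, the GroupBy and SortBy micro-benchmarks above exercise RDD operations of the following kind. This is only a minimal PySpark sketch of the measured pattern, not the benchmark code itself; data size and partition count are placeholders:

    import random
    from pyspark import SparkContext

    sc = SparkContext(appName="groupby-sortby-sketch")
    pairs = sc.parallelize(
        [(random.randint(0, 999), random.random()) for _ in range(100000)],
        numSlices=64)

    print(pairs.groupByKey().mapValues(len).take(5))   # GroupBy pattern
    print(pairs.sortByKey().take(5))                   # SortBy pattern
    sc.stop()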