RDMA for Apache Hadoop 2.x Changelog
-------------------------------------

This file briefly describes the changes to the RDMA for Apache Hadoop 2.x
software package. The logs are arranged in the "most recent first" order.


Release 1.3.5 - 2018-04-11

  NEW FEATURES

    - Support for Containers (Docker and Singularity)
    - Tested with
        - Docker 18.03.0-ce
        - Singularity 2.4.6

  BUG FIXES

    - Fix an issue when compression is enabled during shuffle
        - Thanks to Brian Panneton@US ARL for reporting the issue
    - Fix an issue when multi-disks are used
        - Thanks to Meng Wang@Inspur for reporting the issue
    - Fix an issue when users try to kill applications
        - Thanks to Meng Wang@Inspur for reporting the issue


Release 1.3.0 - 2017-11-10

  NEW FEATURES

    - Initial support for POWER architecture
    - Performance optimization and tuning on OpenPOWER cluster
    - Tested with
        - Various multi-core platforms (e.g., x86, POWER)
        - OpenJDK and IBM JDK


Release 1.2.0 - 2017-07-28

  NEW FEATURES

    - Based on Apache Hadoop 2.8.0
    - Compliant with Apache Hadoop 2.8.0 APIs and applications
    - Support for RDMA Device Selection

  BUG FIXES

    - Fix an issue for TeraValidate running until 33% Reduce Phase
        - Thanks to Vittorio Rebecchi@A3Cube Inc. for reporting the issue


Release 1.1.0 - 2016-11-07

  NEW FEATURES

    - Based on Apache Hadoop 2.7.3
    - Plugin for Apache Hadoop distribution (tested with 2.7.3)
    - Plugin for Hortonworks Data Platform (HDP) (tested with 2.5.0.3)
    - Plugin for Cloudera Distribution Including Apache Hadoop (CDH) (tested with 5.8.2)
    - Compliant with Apache Hadoop 2.7.3, Hortonworks Data Platform (HDP) 2.5.0.3, and Cloudera Distribution Including Apache Hadoop (CDH) 5.8.2 APIs and applications
    - Support for priority local directory selection in MapReduce Shuffle

  BUG FIXES

    - Fix an issue for removing data from Lustre for HHH-L mode
        - Thanks to Yongxin Xin@Inspur for reporting the issue


Release 1.0.0 - 2016-08-24

  NEW FEATURES

    - Memcached-based burst buffer for MapReduce over Lustre-integrated HDFS (HHH-L-BB mode)
    - Optimization of in-memory spill for Maps
    - Support for Mellanox EDR HCA

  BUG FIXES

    - Fix an issue for running with Apache Spark benchmarks
    - Fix a hang issue in HHH-L mode
    - Fix an issue for running with HBase in TCP/IP mode
    - Fix a hang issue in HHH mode


Release 0.9.9 - 2016-03-18

  NEW FEATURES

    - Compliant with Apache Hadoop 2.7.1, Hortonworks Data Platform (HDP) 2.3.0.0, and Cloudera Distribution Including Apache Hadoop (CDH) 5.6.0 APIs and applications
    - Plugin for Cloudera Distribution Including Apache Hadoop (CDH) (tested with 5.6.0)
    - Automatic Locality-aware Shuffle in MapReduce

  BUG FIXES

    - Fix an issue for running with Apache HBase 1.1.2
    - Fix an issue for running with multiple DataNode processes per node


Release 0.9.8 - 2015-09-28

  NEW FEATURES

    - Based on Apache Hadoop 2.7.1
    - Compliant with Apache Hadoop 2.7.1 and Hortonworks Data Platform (HDP) 2.3.0.0 APIs and applications
    - Plugin for Apache Hadoop distribution (tested with 2.7.1)
    - Plugin for Hortonworks Data Platform (HDP) (tested with 2.3.0.0)


Release 0.9.7 - 2015-05-26

  NEW FEATURES

    - Plugin-based architecture supporting RDMA-based designs for HDFS (HHH, HHH-M, HHH-L), MapReduce, MapReduce over Lustre and RPC, etc., as available in 0.9.6 version
    - Plugin for Apache Hadoop distribution (tested with 2.6.0)
    - Plugin for Hortonworks Data Platform (HDP) (tested with 2.2.0.0)
    - Supports deploying Hadoop with Slurm and PBS in different running modes (HHH, HHH-M, HHH-L, and MapReduce over Lustre)


Release 0.9.6 - 2015-03-23

  NEW FEATURES

    - Based on Apache Hadoop 2.6.0
    - Compliant with Apache Hadoop 2.6.0 APIs and applications
    - Easily configurable for different running modes (HHH, HHH-M, HHH-L, and MapReduce over Lustre)
    - Enhanced hybrid HDFS design with in-memory and heterogeneous storage (HHH)
        - Supports three modes of operations
            - HHH (default) with I/O operations over RAM disk, SSD, and HDD
            - HHH-M (in-memory) with I/O operations in-memory
            - HHH-L (Lustre-integrated) with I/O operations in local storage and Lustre
        - Policies to efficiently utilize heterogeneous storage devices (RAM Disk, SSD, HDD, and Lustre)
            - Greedy and Balanced policies support
            - Automatic policy selection based on available storage types
        - Hybrid replication (in-memory and persistent storage) for HHH default mode
        - Memory replication (in-memory only with lazy persistence) for HHH-M mode
        - Lustre-based fault-tolerance for HHH-L mode
            - No HDFS replication
            - Reduced local storage space usage
    - High performance design of MapReduce over Lustre
        - Supports two shuffle approaches
            - Lustre read based shuffle
            - RDMA based shuffle
        - Hybrid shuffle based on both shuffle approaches
            - Configurable distribution support
        - In-memory merge and overlapping of different phases
    - Tested with
        - Mellanox InfiniBand adapters (DDR, QDR, and FDR)
        - RoCE support with Mellanox adapters
        - Various multi-core platforms
        - RAM Disks, SSDs, HDDs, and Lustre

  BUG FIXES

    - Fix a hang issue in running with WordCount-like benchmarks
        - Thanks to Amit Sangroya@TCS for reporting the issue
    - Fix an issue for NameNode running with HA enabled mode
        - Thanks to Qihu Yang@AsiaInfo for reporting the issue


Release 0.9.5 - 2014-11-26

  NEW FEATURES

    - Based on Apache Hadoop 2.5.0
    - High performance design with native InfiniBand and RoCE support at the verbs level for HDFS, MapReduce, and RPC components
    - Compliant with Apache Hadoop 2.5.0 APIs and applications
    - Easily configurable for native InfiniBand, RoCE, and the traditional sockets based support (Ethernet and InfiniBand with IPoIB)
    - On-demand connection setup
    - HDFS over native InfiniBand and RoCE
        - RDMA-based write
        - RDMA-based replication
        - Parallel replication support
        - Optimizations for SSD
    - MapReduce over native InfiniBand and RoCE
        - RDMA-based shuffle
        - Prefetching and caching of map outputs
        - In-memory merge
        - Advanced optimization in overlapping
            - map, shuffle, and merge
            - shuffle, merge, and reduce
        - Optional disk-assisted shuffle
    - RPC over native InfiniBand and RoCE
        - JVM-bypassed buffer management
        - RDMA or send/recv based adaptive communication
        - Intelligent buffer allocation and adjustment for serialization
    - Tested with
        - Mellanox InfiniBand adapters (DDR, QDR, and FDR)
        - RoCE support with Mellanox adapters
        - Various multi-core platforms
        - Different file systems with disks and SSDs


Release 0.9.1 - 2014-08-18

  NEW FEATURES

    - Based on Apache Hadoop 2.4.1
    - High performance design with native InfiniBand and RoCE support at the verbs level for the HDFS component
    - Compliant with Apache Hadoop 2.4.1 APIs and applications
    - Easily configurable for native InfiniBand, RoCE, and the traditional sockets based support (Ethernet and InfiniBand with IPoIB)
    - On-demand connection setup
    - HDFS over native InfiniBand and RoCE
        - RDMA-based write
        - RDMA-based replication
        - Parallel replication support
        - Optimizations for SSD
    - Tested with
        - Mellanox InfiniBand adapters (DDR, QDR, and FDR)
        - RoCE support with Mellanox adapters
        - Various multi-core platforms
        - Different file systems with disks and SSDs