RDMA-based Apache Hadoop 2.x (RDMA-Hadoop-2.x)


RDMA for Apache Hadoop 2.x is a high-performance design of Hadoop over RDMA-enabled Interconnects. This version of RDMA for Apache Hadoop 2.x 1.3.5 is based on Apache Hadoop 2.8.0 and is compliant with Apache Hadoop 2.8.0, Hortonworks Data Platform (HDP) 2.3.0.0, and Cloudera Distribution of Hadoop (CDH) 5.6.0 APIs and applications.

The figure below presents a high-level architecture of RDMA for Apache Hadoop 2.x. In this package, many different modes have been included that can be enabled/disabled to obtain performance benefits for different kinds of applications in different Hadoop environments. This package can be configured to run MapReduce jobs on top of HDFS as well as Lustre.

hadoop2-rdma

Following are the different modes that are included in our package.

MVAPICH2-based Apache Spark (MPI4Spark 0.3)


MPI4Spark v0.3 supports both the standalone and YARN cluster managers. This is done by relying on the Multiple-Program-Multiple-Data (MPMD) MPI launcher mode to launch executor processes. The architecture of MPI4Spark remains that same with the underlying MPI-based Netty at its core.

The following figures showcase the new YARN cluster manager design whereby only executor processes are launched in an MPI environment. The figure to the right highlights the implementation which relies on the linux shared file system. This is different than the standalone cluster manager design where all components of the Spark cluster are launched through the MPI launcher.

MPI4Spark+YARN Design

mpi4spark yarn design

MPI4Spark+YARN Implementation

mpi4spark yarn implementation

Step 1 is concerned with writing the paths of the YARN executor launcher scripts to a shared location visible to all NodeManagers. As mentioned earlier, This is done in the SPARK_HOME directory inside a sub-directory called mpmd_config. Step 2 executes the setup-and-run-yarn-mpmd.sh script which creates the mpmd.config and hostfile files, and step 3 launches the executor containers in their respective NodeManager nodes through MPI. Launching of the executors processes on their respective NodeManager nodes is taken care by the MPI launcher as long as the hostfile and mpmd.config file were properly generated.