HiBench Sort Execution Time on 32 Worker Nodes / 768 Cores


HiBench TeraSort Execution Time on 32 Worker Nodes / 768 Cores


Experimental Testbed: Each compute node in this cluster has two twelve-core Intel Xeon E5-2680v3 processors, 128GB of DDR4 DRAM, and a 320GB local SSD, and runs the CentOS operating system. The cluster interconnect is 56Gbps FDR InfiniBand, with full bisection bandwidth within a rack and 4:1 oversubscription across racks.

These experiments are performed on 32 Worker Nodes with a total of 768 maps and 768 reduces, so that the job runs with full subscription on all 768 Cores in the cluster. Spark is run in Standalone mode. The configuration used is spark_worker_memory=96GB, spark_worker_cores=24, spark_executor_instances=1, and spark_driver_memory=2GB. SSD is used for Spark Local and Work data.
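
The full-subscription figure follows from the testbed description: two twelve-core processors give 24 cores per node, and 32 nodes give 768 cores, with one map and one reduce task per core. A minimal sketch of that arithmetic is shown here; the function and constant names are illustrative and are not part of HiBench or Spark.

```python
# Illustrative arithmetic only; names are hypothetical, not HiBench or Spark APIs.
SOCKETS_PER_NODE = 2   # two processors per node (from the testbed description)
CORES_PER_SOCKET = 12  # twelve-core Intel Xeon E5-2680v3


def total_cores(worker_nodes: int) -> int:
    """Cores available across the given number of worker nodes."""
    return worker_nodes * SOCKETS_PER_NODE * CORES_PER_SOCKET


def tasks_for_full_subscription(worker_nodes: int) -> dict:
    """One map and one reduce task per core, so every core has work."""
    cores = total_cores(worker_nodes)
    return {"cores": cores, "maps": cores, "reduces": cores}


if __name__ == "__main__":
    # prints {'cores': 768, 'maps': 768, 'reduces': 768}
    print(tasks_for_full_subscription(32))
```

The per-node core count also matches spark_worker_cores=24, so each worker exposes every core of its node to Spark.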

The RDMA-IB design improves the average job execution time of HiBench Sort by 20%-25% and HiBench TeraSort by 14%-16% compared to IPoIB (56Gbps).


HiBench Sort Execution Time on 8 Worker Nodes / 192 Cores


HiBench TeraSort Execution Time on 8 Worker Nodes / 192 Cores


Experimental Testbed: Each compute node in this cluster has two twelve-core Intel Xeon E5-2680v3 processors, 128GB of DDR4 DRAM, and a 320GB local SSD, and runs the CentOS operating system. The cluster interconnect is 56Gbps FDR InfiniBand, with full bisection bandwidth within a rack and 4:1 oversubscription across racks.

These experiments are performed on 8 Worker Nodes with a total of 192 maps and 192 reduces, so that the job runs with full subscription on all 192 Cores in the cluster. The configuration used is spark_worker_memory=96GB, spark_worker_cores=24, spark_executor_instances=1, and spark_driver_memory=2GB. SSD is used for Spark Local and Work data. We run RDMA Spark in two modes: (1) with default HDFS (Apache Hadoop 2.7.3), and (2) with RDMA-enhanced Heterogeneous HDFS, or HHH (RDMA-based Apache Hadoop 2.x).
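
The settings above are given in shorthand. As a rough sketch, and assuming these runs use the same Standalone deployment as the 32-node experiments, they might map onto the `SPARK_WORKER_MEMORY` and `SPARK_WORKER_CORES` environment variables (normally set in `conf/spark-env.sh`) and the `spark.executor.instances` and `spark.driver.memory` properties (normally set in `conf/spark-defaults.conf`). This mapping is an assumption rather than something the source states, and the helper names are hypothetical.

```python
# Hypothetical mapping from the shorthand names to Spark Standalone settings.
# The source gives only the shorthand; the target names are an assumption.
SHORTHAND = {
    "spark_worker_memory": "96g",
    "spark_worker_cores": "24",
    "spark_executor_instances": "1",
    "spark_driver_memory": "2g",
}

# Worker-level settings are environment variables in Standalone mode.
ENV_VARS = {
    "spark_worker_memory": "SPARK_WORKER_MEMORY",
    "spark_worker_cores": "SPARK_WORKER_CORES",
}

# Application-level settings are Spark properties.
PROPERTIES = {
    "spark_executor_instances": "spark.executor.instances",
    "spark_driver_memory": "spark.driver.memory",
}


def render_spark_env(settings: dict) -> str:
    """Render the worker settings as lines for conf/spark-env.sh."""
    return "\n".join(
        f"export {ENV_VARS[key]}={value}"
        for key, value in settings.items()
        if key in ENV_VARS
    )


def render_spark_defaults(settings: dict) -> str:
    """Render the application settings as lines for conf/spark-defaults.conf."""
    return "\n".join(
        f"{PROPERTIES[key]} {value}"
        for key, value in settings.items()
        if key in PROPERTIES
    )


if __name__ == "__main__":
    print(render_spark_env(SHORTHAND))
    print(render_spark_defaults(SHORTHAND))
```

Keeping the worker-level and application-level settings in separate tables mirrors how Spark reads them: the former when a worker daemon starts, the latter when an application is submitted.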

The RDMA-IB design improves the average job execution time of HiBench Sort by 20%-29% and of HiBench TeraSort by 12%-14% compared to IPoIB (56Gbps). The RDMA-HHH-IB design improves the average job execution time of HiBench Sort by 71% and of HiBench TeraSort by 18%-25% compared to IPoIB (56Gbps).
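
A percentage improvement can be easier to compare when restated as a speedup factor. Assuming that "improves the average job execution time by r%" means the RDMA run takes (100 - r)% of the IPoIB time, which is an interpretation rather than something the source states, the speedup is 1 / (1 - r/100). The helper name below is hypothetical.

```python
def speedup_from_reduction(percent_reduction: float) -> float:
    """Convert a percentage reduction in execution time into a speedup factor.

    Assumes the percentage is measured relative to the baseline (IPoIB) time.
    """
    if not 0 <= percent_reduction < 100:
        raise ValueError("percent_reduction must be in the range [0, 100)")
    return 1.0 / (1.0 - percent_reduction / 100.0)


if __name__ == "__main__":
    # Percentages taken from the 8-node HiBench Sort results quoted above.
    for label, pct in [
        ("Sort, RDMA-IB (low)", 20),
        ("Sort, RDMA-IB (high)", 29),
        ("Sort, RDMA-HHH-IB", 71),
    ]:
        print(f"{label}: {pct}% -> {speedup_from_reduction(pct):.2f}x")
```

Under this reading, the 71% figure for HiBench Sort with RDMA-HHH-IB corresponds to a speedup of roughly 3.4x over IPoIB, while the 20%-29% range for RDMA-IB corresponds to roughly 1.25x to 1.4x.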