HiBench PageRank Execution Time on 32 Worker Nodes / 768 Cores

Experimental Testbed: Each compute node in this cluster has two twelve-core Intel Xeon E5-2680v3 processors, 128GB of DDR4 DRAM, and 320GB of local SSD, and runs the CentOS operating system. The cluster network is 56Gbps FDR InfiniBand, with full bisection bandwidth within a rack and 4:1 oversubscription across racks.

These experiments are performed with full subscription of the cores, so that on 32 Worker Nodes (24 cores each) jobs run with a total of 768 maps and 768 reduces. Spark is run in Standalone mode. The configuration used is spark_worker_memory=96GB, spark_worker_cores=24, spark_executor_instances=1. SSD is used for Spark Local and Work data.
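The task counts follow directly from the testbed figures. As a quick sanity check, the sketch below recomputes them in Python; the constant names are illustrative and are not part of any Spark or HiBench API.

```python
# Figures taken from the testbed and configuration described above.
WORKER_NODES = 32
SOCKETS_PER_NODE = 2       # two processors per node
CORES_PER_SOCKET = 12      # twelve-core Xeon E5-2680v3
WORKER_MEMORY_GB = 96      # spark_worker_memory

# One worker per node uses every core, matching spark_worker_cores=24.
cores_per_node = SOCKETS_PER_NODE * CORES_PER_SOCKET

# Full subscription: one concurrent task per core across the cluster.
total_cores = WORKER_NODES * cores_per_node

# Aggregate memory made available to Spark workers.
total_worker_memory_gb = WORKER_NODES * WORKER_MEMORY_GB

print(cores_per_node, total_cores, total_worker_memory_gb)  # 24 768 3072
```

The 768 maps and 768 reduces therefore correspond to one task per core, and each worker is given 96GB of the node's 128GB of DRAM.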

The RDMA-IB design improves the job execution time of HiBench PageRank on 32 Worker Nodes by 34% on average, and by up to 44%, compared to IPoIB (56Gbps). The highest improvement is seen for the Gigantic data size.
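For readers who want to derive comparable figures from their own runs, the following is a minimal sketch of one way such percentages could be computed. It assumes improvement is defined as the relative reduction in execution time against the IPoIB baseline, and that the average is the mean of the per-data-size improvements; the source does not state either definition. The function names and the timings are hypothetical placeholders, not measured results.

```python
def improvement_pct(baseline_s, candidate_s):
    """Percentage reduction in execution time relative to the baseline."""
    if baseline_s <= 0:
        raise ValueError("baseline time must be positive")
    return (baseline_s - candidate_s) / baseline_s * 100.0


def summarize(times):
    """Return (mean improvement, best data size, best improvement).

    `times` maps a data-size label to (baseline_seconds, candidate_seconds).
    """
    per_size = {size: improvement_pct(b, c) for size, (b, c) in times.items()}
    best = max(per_size, key=per_size.get)
    mean = sum(per_size.values()) / len(per_size)
    return mean, best, per_size[best]


# Placeholder timings in seconds as (IPoIB, RDMA); NOT measured values.
example = {"size_a": (100.0, 80.0), "size_b": (200.0, 120.0)}
mean_pct, best_size, best_pct = summarize(example)
print(f"mean {mean_pct:.0f}%, best {best_pct:.0f}% on {best_size}")
```

Averaging per-size improvements weights every data size equally; comparing summed execution times instead would weight the larger data sizes more heavily.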