We compare our package with default Hadoop MapReduce running over default HDFS. For detailed configuration and setup, please refer to our userguide.
TestDFSIO
TestDFSIO Throughput
Experimental Testbed: Each node in OpenPOWER Cluster has 20 cores (each core has 8 SMT threads) POWER8 8335-GTA processors at 3491 MHz and 256GB RAM. The nodes are equipped with Mellanox ConnectX-4 EDR HCAs. The operating system used is Red Hat Enterprise Linux Server release 7.2.
These experiments are performed in 5 DataNodes with a total of 60 maps. HDFS block size is kept to 256MB. Each NodeManager is configured to run with 12 concurrent containers assigning a minimum of 4GB memory per container. The NameNode runs in a different node of the Hadoop cluster..
For TestDFSIO throughput experiment, RDMA-IB design has an improvement of 2.18x - 2.26x compared to IPoIB (100Gbps).
Sort
Sort Execution Time
Experimental Testbed: Each node in OpenPOWER Cluster has 20 cores (each core has 8 SMT threads) POWER8 8335-GTA processors at 3491 MHz and 256GB RAM. The nodes are equipped with Mellanox ConnectX-4 EDR HCAs. The operating system used is Red Hat Enterprise Linux Server release 7.2.
These experiments are performed in 5 DataNodes with a total of 60 maps. HDFS block size is kept to 256MB. Each NodeManager is configured to run with 12 concurrent containers assigning a minimum of 4GB memory per container. The NameNode runs in a different node of the Hadoop cluster..
The RDMA-IB design improves the job execution time of Sort by a maximum of 55% compared to IPoIB (100Gbps).
TeraSort
TeraSort Execution Time
Experimental Testbed: Each node in OpenPOWER Cluster has 20 cores (each core has 8 SMT threads) POWER8 8335-GTA processors at 3491 MHz and 256GB RAM. The nodes are equipped with Mellanox ConnectX-4 EDR HCAs. The operating system used is Red Hat Enterprise Linux Server release 7.2.
These experiments are performed in 5 DataNodes with a total of 60 maps. HDFS block size is kept to 256MB. Each NodeManager is configured to run with 12 concurrent containers assigning a minimum of 4GB memory per container. The NameNode runs in a different node of the Hadoop cluster..
The RDMA-IB design improves the job execution time of TeraSort by a maximum of 21% compared to IPoIB (100Gbps).