We compare our package with default Spark. For detailed configuration and setup instructions, please refer to our user guide.
SortBy
SortBy Execution Time
Experimental Testbed: Each node in OpenPOWER Cluster 1 has 20 cores (8 SMT threads per core) on POWER8 8335-GTA processors running at 3491 MHz, and 256 GB of RAM. The nodes are equipped with Mellanox ConnectX-4 EDR HCAs. The operating system is Red Hat Enterprise Linux Server release 7.2.
These experiments are performed on 5 DataNodes with a total of 60 maps. The HDFS block size is set to 256 MB. Each NodeManager is configured to run 12 concurrent containers, with a minimum of 4 GB of memory assigned per container. The NameNode runs on a separate node of the Hadoop cluster.
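The container settings above can be sanity-checked with a little arithmetic. The sketch below is illustrative only: it assumes one NodeManager runs on each DataNode and that each map task occupies exactly one container, neither of which is stated explicitly here. The function and variable names are hypothetical.

```python
# Figures taken from the setup description above.
DATANODES = 5
CONTAINERS_PER_NODEMANAGER = 12
MIN_CONTAINER_MEMORY_GB = 4
TOTAL_MAPS = 60


def cluster_capacity(datanodes, containers_per_node, min_memory_gb):
    """Return (total concurrent containers, minimum container memory per node in GB)."""
    total_containers = datanodes * containers_per_node
    memory_per_node_gb = containers_per_node * min_memory_gb
    return total_containers, memory_per_node_gb


total_containers, min_memory_per_node = cluster_capacity(
    DATANODES, CONTAINERS_PER_NODEMANAGER, MIN_CONTAINER_MEMORY_GB
)

# Assumption: one map task per container, so the number of scheduling
# "waves" is the ceiling of maps divided by concurrent containers.
map_waves = -(-TOTAL_MAPS // total_containers)

print(f"Concurrent containers in the cluster: {total_containers}")
print(f"Minimum container memory per node:    {min_memory_per_node} GB")
print(f"Map waves needed for {TOTAL_MAPS} maps:         {map_waves}")
```

Under these assumptions, the 60 maps fit into the 60 available containers in a single wave, and the minimum memory reserved for containers on each node is 48 GB.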
For the SortBy experiment, the RDMA-IB design achieves a maximum improvement of 18% over IPoIB (100 Gbps).
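The exact formula behind the reported percentage is not given here. A common convention, assumed in the sketch below, is the relative reduction in job execution time with IPoIB as the baseline. The timings used are hypothetical placeholders, not measured results.

```python
def improvement_percent(baseline_seconds, candidate_seconds):
    """Relative reduction in execution time, as a percentage of the baseline.

    Assumption: 'improvement' means (baseline - candidate) / baseline.
    """
    if baseline_seconds <= 0:
        raise ValueError("baseline time must be positive")
    return 100.0 * (baseline_seconds - candidate_seconds) / baseline_seconds


# Hypothetical timings chosen only to illustrate the calculation.
ipoib_seconds = 100.0
rdma_seconds = 82.0

result = improvement_percent(ipoib_seconds, rdma_seconds)
print(f"Improvement over IPoIB: {result:.1f}%")
```

With this convention, a "maximum improvement" is the largest such value observed across the data sizes tested.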
GroupBy
GroupBy Execution Time
Experimental Testbed: Each node in OpenPOWER Cluster 2 has 20 cores (8 SMT threads per core) on POWER8 8335-GTA processors running at 3491 MHz, and 256 GB of RAM. The nodes are equipped with Mellanox ConnectX-4 EDR HCAs. The operating system is Red Hat Enterprise Linux Server release 7.2.
These experiments are performed on 5 DataNodes with a total of 60 maps. The HDFS block size is set to 256 MB. Each NodeManager is configured to run 12 concurrent containers, with a minimum of 4 GB of memory assigned per container. The NameNode runs on a separate node of the Hadoop cluster.
The RDMA-IB design improves the job execution time of GroupBy by up to 11% compared to IPoIB (100 Gbps).
TeraSort
TeraSort Execution Time
Experimental Testbed: Each node in OpenPOWER Cluster 2 has 20 cores (8 SMT threads per core) on POWER8NVL processors running at 2 GHz, and 64 GB of RAM. The nodes are equipped with Mellanox ConnectX-4 EDR HCAs. The operating system is Red Hat Enterprise Linux Server release 7.0.
These experiments are performed on 5 DataNodes with a total of 60 maps. The HDFS block size is set to 256 MB. Each NodeManager is configured to run 12 concurrent containers, with a minimum of 4 GB of memory assigned per container. The NameNode runs on a separate node of the Hadoop cluster.
The RDMA-IB design improves the job execution time of TeraSort by up to 35% compared to IPoIB (100 Gbps).
Sort
Sort Execution Time
Experimental Testbed: Each node in OpenPOWER Cluster 2 has 20 cores (8 SMT threads per core) on POWER8NVL processors running at 2 GHz, and 64 GB of RAM. The nodes are equipped with Mellanox ConnectX-4 EDR HCAs. The operating system is Red Hat Enterprise Linux Server release 7.0.
These experiments are performed on 5 DataNodes with a total of 60 maps. The HDFS block size is set to 256 MB. Each NodeManager is configured to run 12 concurrent containers, with a minimum of 4 GB of memory assigned per container. The NameNode runs on a separate node of the Hadoop cluster.
The RDMA-IB design improves the job execution time of Sort by up to 25% compared to IPoIB (100 Gbps).
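For readers who prefer speedup factors to percentages, the maximum improvements reported for the four benchmarks can be converted as sketched below. This assumes that each percentage is a reduction in execution time relative to IPoIB, which is an interpretation rather than something stated explicitly. The dictionary and function names are hypothetical.

```python
# Maximum improvements over IPoIB (100 Gbps) reported for each benchmark.
MAX_IMPROVEMENT_PERCENT = {
    "SortBy": 18,
    "GroupBy": 11,
    "TeraSort": 35,
    "Sort": 25,
}


def speedup_from_improvement(percent):
    """Convert a time reduction in percent to a speedup factor.

    Assumption: a p% improvement means the job takes (1 - p/100) times
    as long as the baseline, so the speedup is 1 / (1 - p/100).
    """
    if not 0 <= percent < 100:
        raise ValueError("percent must be in [0, 100)")
    return 1.0 / (1.0 - percent / 100.0)


speedups = {
    name: speedup_from_improvement(p)
    for name, p in MAX_IMPROVEMENT_PERCENT.items()
}

for name, factor in speedups.items():
    print(f"{name:<9} {MAX_IMPROVEMENT_PERCENT[name]:>3}%  ->  {factor:.2f}x")
```

Because the benchmarks were run on testbeds with different processors and memory sizes, these factors are best read per benchmark rather than compared directly with one another.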