SQL-Aggregation Execution Time on 8 Worker Nodes / 192 Cores

Figure: Spark SQL aggregation

Experimental Testbed: Each compute node in this cluster has two twelve-core Intel Xeon E5-2680v3 processors, 128GB of DDR4 DRAM, and 80GB of local SSD, and runs the CentOS operating system. The cluster network is 56Gbps FDR InfiniBand, with full bisection bandwidth within a rack and 4:1 oversubscription across racks.

These experiments run the SQL aggregation query on 8 Worker Nodes over two tables whose combined size ranges from 4GB to 16GB. We use Apache Hive version 1.2.1 with a PostgreSQL database as the metastore. The Hive tables are stored on HHH, the RDMA-enhanced Heterogeneous HDFS design, which serves as the underlying filesystem. Spark runs in Standalone mode with spark_worker_memory=96GB, spark_worker_cores=24, and spark_executor_instances=1. SSD is used for Spark Local and Work data.
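As a quick sanity check on the stated configuration, the following Python sketch derives the cluster-wide totals from the per-node figures given above. The constant and function names are illustrative and not part of any Spark or Hive API; only the numeric values come from the testbed description.

```python
# Per-node values from the testbed description; names are illustrative only.
NODES = 8                 # worker nodes
SOCKETS_PER_NODE = 2      # two processors per node
CORES_PER_SOCKET = 12     # twelve-core Xeon E5-2680v3
NODE_DRAM_GB = 128        # DDR4 DRAM per node
WORKER_MEMORY_GB = 96     # spark_worker_memory
WORKER_CORES = 24         # spark_worker_cores


def cluster_totals():
    """Return aggregate resources implied by the per-node configuration."""
    return {
        "physical_cores": NODES * SOCKETS_PER_NODE * CORES_PER_SOCKET,
        "spark_cores": NODES * WORKER_CORES,
        "spark_memory_gb": NODES * WORKER_MEMORY_GB,
        # Share of each node's DRAM handed to the Spark worker.
        "dram_fraction": WORKER_MEMORY_GB / NODE_DRAM_GB,
    }


if __name__ == "__main__":
    for name, value in cluster_totals().items():
        print(f"{name}: {value}")
```

Under these values the Spark workers together expose 192 cores, matching the 8 Worker Nodes / 192 Cores figure in the title, and each worker is given 75% of its node's DRAM.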

The RDMA-IB design with HHH reduces the job execution time of SQL-Aggregation by 27% to 41% compared to IPoIB (56Gbps).
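The text does not state how the improvement percentage is computed. A common convention, assumed here, is the relative reduction in execution time with respect to the baseline. The sketch below shows that calculation; the function name and the timings in the usage line are hypothetical and are not measurements from these experiments.

```python
def improvement_percent(baseline_s: float, improved_s: float) -> float:
    """Relative reduction in execution time, as a percentage of the baseline.

    Assumed formula: 100 * (baseline - improved) / baseline.
    """
    if baseline_s <= 0:
        raise ValueError("baseline time must be positive")
    return 100.0 * (baseline_s - improved_s) / baseline_s


if __name__ == "__main__":
    # Hypothetical timings in seconds, chosen only to illustrate the formula.
    print(improvement_percent(200.0, 150.0))  # prints 25.0
```

With this definition, a 27% improvement means the RDMA-IB run takes 73% of the IPoIB run's time for the same query and data size.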