In MapReduce over Lustre mode, we illustrate two different configurations that can be used with this setup. Lustre provides both the input and output data directories; for intermediate data, either local disks or Lustre itself can be used. We evaluate our package with both setups and present performance comparisons here. For detailed configuration and setup instructions, please refer to our userguide.

MapReduce over Lustre (with local disks)

TeraSort Execution Time

[Figure: TeraSort execution time (Comet)]

Sort Execution Time

[Figure: Sort execution time (Comet)]

Experimental Testbed: Each compute node in this cluster has two twelve-core Intel Xeon E5-2680v3 processors, 128 GB of DDR4 DRAM, and a 320 GB local SSD, running the CentOS operating system. Each node has 64 GB of RAM disk capacity. The cluster interconnect is 56 Gbps FDR InfiniBand with rack-level full bisection bandwidth and 4:1 oversubscribed cross-rack bandwidth.

These experiments are performed on 16 nodes with a total of 64 map tasks; TeraSort uses 32 reducers and Sort uses 28. The Lustre stripe size is configured to 256 MB, matching the file system block size used in MapReduce. The SSD inside each node is configured as the intermediate data directory. Each NodeManager runs up to 6 concurrent containers, with a minimum of 1.5 GB of memory per container. The RDMA-IB design improves the job execution time of Sort by 40% over IPoIB (56Gbps). The maximum performance improvement for TeraSort is 42%.
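The NodeManager settings above can be expressed with stock Hadoop/YARN configuration properties. This is only a hedged sketch: the property names are standard YARN ones, the memory values are derived from the 6-container/1.5 GB figures above, and the SSD mount path is a hypothetical placeholder; the exact settings for our package are given in the userguide.

```xml
<!-- yarn-site.xml (sketch): values derived from the text above;
     the local-dirs path is a placeholder, not our actual layout -->
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>1536</value> <!-- minimum of 1.5 GB per container -->
</property>
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>9216</value> <!-- 6 concurrent containers x 1.5 GB -->
</property>
<property>
  <name>yarn.nodemanager.local-dirs</name>
  <value>/scratch/ssd/yarn/local</value> <!-- local SSD holds intermediate data -->
</property>
```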


MapReduce over Lustre (w/o local disks)

TeraSort Execution Time

[Figure: TeraSort execution time (Comet)]

Sort Execution Time

[Figure: Sort execution time (Comet)]

Experimental Testbed: Each compute node in this cluster has two twelve-core Intel Xeon E5-2680v3 processors, 128 GB of DDR4 DRAM, and a 320 GB local SSD, running the CentOS operating system. Each node has 64 GB of RAM disk capacity. The cluster interconnect is 56 Gbps FDR InfiniBand with rack-level full bisection bandwidth and 4:1 oversubscribed cross-rack bandwidth.

These experiments are performed on 16 nodes with a total of 64 map tasks; TeraSort uses 32 reducers and Sort uses 28. The Lustre stripe size is configured to 256 MB, matching the file system block size used in MapReduce. Lustre is configured as both the input/output and the intermediate data directory. Each NodeManager runs up to 6 concurrent containers, with a minimum of 1.5 GB of memory per container. The RDMA-IB design improves the job execution time of Sort by a maximum of 45% over IPoIB (56Gbps). The maximum performance improvement for TeraSort is also 45%.
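As a quick illustration of how the reported gains are computed, the relative improvement is (t_IPoIB - t_RDMA) / t_IPoIB. The execution times in the example below are hypothetical, chosen only to reproduce a 45% figure; they are not measured values from these runs.

```python
# Percent improvement of RDMA-IB over IPoIB for a job's execution time.
def improvement_pct(t_ipoib: float, t_rdma: float) -> float:
    """Relative reduction in execution time, as a percentage."""
    return 100.0 * (t_ipoib - t_rdma) / t_ipoib

# Hypothetical Sort times (seconds), picked to illustrate a 45% gain.
print(improvement_pct(200.0, 110.0))  # -> 45.0
```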

MapReduce over Lustre (with local disks)

TeraSort Execution Time

[Figure: TeraSort execution time (OSU-RI2)]

Sort Execution Time

[Figure: Sort execution time (OSU-RI2)]

Experimental Testbed: Each node in OSU-RI2 has two fourteen-core Intel Xeon E5-2680v4 processors at 2.4 GHz and 512 GB of main memory. The nodes support 16x PCI Express Gen3 interfaces and are equipped with Mellanox ConnectX-4 EDR HCAs with PCI Express Gen3 interfaces. The operating system is CentOS 7.

These experiments are performed on 8 nodes with a total of 32 map tasks; TeraSort uses 16 reducers and Sort uses 14. The Lustre stripe size is configured to 256 MB, matching the file system block size used in MapReduce. The SSD inside each node is configured as the intermediate data directory. Each NodeManager runs up to 6 concurrent containers, with a minimum of 1.5 GB of memory per container. The RDMA-IB design improves the job execution time of Sort by 22% over IPoIB (100Gbps). The maximum performance improvement for TeraSort is 16%.
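The 256 MB stripe size used in all of these experiments can be set on the Lustre-resident job directories with the standard `lfs setstripe` tool. This is a sketch only: the directory path below is a placeholder, not our actual mount point.

```shell
# Set a 256 MB stripe size on the MapReduce data directory
# (the path is a hypothetical placeholder).
lfs setstripe -S 256M /mnt/lustre/hadoop/data

# Verify the layout that new files created in the directory will inherit.
lfs getstripe /mnt/lustre/hadoop/data
```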


MapReduce over Lustre (w/o local disks)

TeraSort Execution Time

[Figure: TeraSort execution time (OSU-RI2)]

Sort Execution Time

[Figure: Sort execution time (OSU-RI2)]

Experimental Testbed: Each node in OSU-RI2 has two fourteen-core Intel Xeon E5-2680v4 processors at 2.4 GHz and 512 GB of main memory. The nodes support 16x PCI Express Gen3 interfaces and are equipped with Mellanox ConnectX-4 EDR HCAs with PCI Express Gen3 interfaces. The operating system is CentOS 7.

These experiments are performed on 8 nodes with a total of 32 map tasks; TeraSort uses 16 reducers and Sort uses 14. The Lustre stripe size is configured to 256 MB, matching the file system block size used in MapReduce. Lustre is configured as both the input/output and the intermediate data directory. Each NodeManager runs up to 6 concurrent containers, with a minimum of 1.5 GB of memory per container. The RDMA-IB design improves the job execution time of Sort by 14% over IPoIB (100Gbps). The maximum performance improvement for TeraSort is 14%.