There are two Kafka micro-benchmarks: (1) a producer-based latency benchmark and (2) a producer-based throughput benchmark. For detailed configuration and setup instructions, please refer to our user guide.

Latency and Throughput

Average Latency

Aggregate Throughput

Experimental Testbed: Each compute node in this cluster has two twelve-core Intel Xeon E5-2680v3 processors, 128GB of DDR4 DRAM, and 320GB of local SSD, and runs the CentOS operating system. Each node has 64GB of RAM disk capacity. The cluster network is 56Gbps FDR InfiniBand with full bisection bandwidth within each rack and 4:1 oversubscription across racks.
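
As a rough illustration of what the 4:1 cross-rack oversubscription means, the sketch below estimates the per-node bandwidth available when every node in a rack sends traffic across racks at the same time. It assumes the nominal 56Gbps link rate and an even split of the uplink capacity; the function name is hypothetical, and real throughput also depends on encoding overhead and traffic patterns.

```python
# Nominal figures taken from the testbed description above.
LINK_RATE_GBPS = 56      # FDR InfiniBand link rate per node
OVERSUBSCRIPTION = 4     # 4:1 oversubscription across racks


def worst_case_cross_rack_gbps(link_rate_gbps=LINK_RATE_GBPS,
                               oversubscription=OVERSUBSCRIPTION):
    """Per-node cross-rack bandwidth if all nodes in a rack send at once.

    Assumption: uplink capacity is shared evenly among the nodes.
    """
    return link_rate_gbps / oversubscription


print(worst_case_cross_rack_gbps())  # 14.0
```

Within a rack, full bisection bandwidth means this division does not apply, so nodes can in principle communicate at the full link rate.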

These experiments are run with one broker and a variable number of producers, each on a different node. In every experiment, the message size is set to 1000 bytes. Compared with IPoIB (56Gbps), the RDMA-Kafka design reduces latency by up to 33% and increases throughput by up to 1.7x.
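
To make the reported metrics concrete, here is a minimal sketch of how average latency, aggregate throughput, and the two improvement figures could be derived from per-producer measurements. This is not the benchmark's own code: the function names are hypothetical, and the choice of MB/s with 1 MB = 10^6 bytes is an assumption. Only the 1000-byte message size comes from the experiment description.

```python
from statistics import mean

MESSAGE_SIZE_BYTES = 1000  # message size stated in the experiment description


def average_latency_ms(per_producer_latency_ms):
    """Mean of the average latencies reported by each producer."""
    return mean(per_producer_latency_ms)


def aggregate_throughput_mb_per_s(per_producer_msgs_per_s,
                                  message_size=MESSAGE_SIZE_BYTES):
    """Sum of all producer message rates, converted to MB/s.

    Assumption: 1 MB = 10**6 bytes.
    """
    return sum(per_producer_msgs_per_s) * message_size / 1e6


def latency_reduction_pct(baseline_ms, improved_ms):
    """Percentage by which latency drops relative to the baseline."""
    return (baseline_ms - improved_ms) / baseline_ms * 100.0


def throughput_speedup(baseline, improved):
    """Ratio of improved to baseline throughput (for example, 1.7x)."""
    return improved / baseline
```

With placeholder inputs chosen only to show the arithmetic (they are not measured results), `latency_reduction_pct(3.0, 2.0)` gives roughly 33%, and `throughput_speedup(100.0, 170.0)` gives 1.7. Note that the latency figure is a percentage reduction while the throughput figure is a multiplicative factor, so the two are not directly comparable.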