Design and Optimization of GPU-Aware MPI Allreduce Using Direct Sendrecv Communication. C. Chen, J. Yao, H. Subramoni, D. Panda. NVIDIA GTC AI Conference 2026, Mar 2026.