Transformations to Parallel Codes for Communication-Computation Overlap

Author : Danalis, Anthony; Kim, Ki-Yong; Pollock, Lori; Swany, Martin
Booktitle : In Proceedings of IEEE/ACM Conference on High Performance Computing, Networking, Storage and Analysis 2005 (SC2005)
Date : Nov 2005
Publisher : IEEE/ACM
Keyword(s) : Parallel processing, Communication Computation Overlapping, Compiler optimization, RDMA
Document Type : In Conference Proceedings

Abstract :

This paper presents program transformations directed toward improving communication-computation overlap in parallel programs that use MPI’s collective operations. Our transformations target a wide variety of applications, focusing on scientific codes whose computation loops exhibit limited dependence among iterations. We give developers guidance on transforming application code to exploit the communication-computation overlap available in the underlying cluster, and we discuss the performance improvements our transformations achieve. We present results from a detailed study of how problem size, message size, level of communication-computation overlap, and amount of communication aggregation affect runtime performance in a cluster environment based on an RDMA-enabled network. The targets of our study are two scientific codes written by domain scientists, but the applicability of our work extends well beyond these two applications.