Better implementation of MPI_Allreduce for intercommunicators.
This patch provides a better implementation in which each group performs
an intra-reduce concurrently, exchanges the reduce result with the other
group, and broadcasts the received result within the group. The previous
implementation serialized the intra-reduces of the two groups.
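
The three steps can be illustrated with a small simulation in plain C. This is only a sketch of the data flow, not the code in the patch: it uses no MPI calls, models each group as an array of per-rank integers, assumes a sum reduction, and every name in it (intra_reduce_sum, intra_bcast, sim_inter_allreduce_sum) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sum the contributions of one group.
 * Stands in for the intra-group reduce to that group's root. */
static int intra_reduce_sum(const int *vals, size_t n)
{
    int acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += vals[i];
    return acc;
}

/* Hypothetical helper: copy one value to every rank of a group.
 * Stands in for the intra-group broadcast from the root. */
static void intra_bcast(int value, int *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = value;
}

/* Simulated intercommunicator allreduce (sum) over groups A and B.
 * send_a/send_b hold one value per rank; recv_a/recv_b receive the
 * result per rank. Each group ends up with the reduction of the
 * OTHER group's data, as intercommunicator semantics require. */
void sim_inter_allreduce_sum(const int *send_a, int *recv_a, size_t n_a,
                             const int *send_b, int *recv_b, size_t n_b)
{
    assert(n_a > 0 && n_b > 0);

    /* Step 1: the two intra-reduces do not depend on each other,
     * so in a real run they can proceed concurrently. */
    int root_a = intra_reduce_sum(send_a, n_a);
    int root_b = intra_reduce_sum(send_b, n_b);

    /* Step 2: the two roots exchange their reduce results. */
    int received_by_a = root_b;
    int received_by_b = root_a;

    /* Step 3: each root broadcasts what it received within its group. */
    intra_bcast(received_by_a, recv_a, n_a);
    intra_bcast(received_by_b, recv_b, n_b);
}
```

With group A contributing {1, 2, 3} and group B contributing {10, 20}, every rank in A ends up with 30 and every rank in B with 6. Because step 1 has no cross-group dependency, neither group needs to wait for the other's reduce to finish before starting its own.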
Fixes ticket #2074.
Signed-off-by: Sangmin Seo <sseo@anl.gov>