    [svn-r7720] Fix collectives to not hang if the communicator contains a failed... · 3e71b782
    Darius Buntinas authored
    [svn-r7720] Fix collectives to not hang if the communicator contains a failed process.  The collectives do not return an error immediately upon detecting a failure; instead they continue the communication pattern and return the error at the end of the function, so that other processes waiting to receive messages do not hang.  This means that, although the collective should complete at all processes, some processes will receive an error and some may not get a valid result.  Because a process may receive an invalid result without receiving an error, a separate mechanism, such as MPI_Comm_validate from the MPI-3 fault-tolerance (FT) proposal, is needed to confirm that the collective completed correctly.
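The deferred-error pattern described in the message can be illustrated with a small self-contained sketch. It is not the MPICH implementation and makes no MPI calls. The names `sketch_exchange`, `sketch_collective`, the `SKETCH_*` codes, and the `failed` array are all hypothetical stand-ins, chosen only to show the idea of recording the first error, finishing every communication step anyway, and returning the error at the end.

```c
#include <assert.h>

/* Hypothetical error codes; these are not MPI or MPICH constants. */
enum { SKETCH_SUCCESS = 0, SKETCH_ERR_PROC_FAILED = 1 };

/* Hypothetical stand-in for one point-to-point step of a collective.
 * It counts the step and reports an error if the peer is marked failed. */
static int sketch_exchange(int peer, const int *failed, int *steps_done)
{
    (*steps_done)++;
    return failed[peer] ? SKETCH_ERR_PROC_FAILED : SKETCH_SUCCESS;
}

/* Runs every step of the (assumed, simplified) communication pattern.
 * An error is remembered but does not cut the loop short, so peers that
 * are still alive are not left waiting. The first error is returned only
 * after the whole pattern has been attempted. */
static int sketch_collective(int nprocs, int rank, const int *failed,
                             int *steps_done)
{
    assert(rank >= 0 && rank < nprocs);

    int first_err = SKETCH_SUCCESS;
    *steps_done = 0;

    for (int peer = 0; peer < nprocs; peer++) {
        if (peer == rank)
            continue;
        int err = sketch_exchange(peer, failed, steps_done);
        if (err != SKETCH_SUCCESS && first_err == SKETCH_SUCCESS)
            first_err = err;   /* record the error, keep communicating */
    }
    return first_err;          /* reported at the end of the function */
}
```

With four processes and process 2 marked as failed, calling `sketch_collective(4, 0, failed, &steps)` still performs all three exchanges and then returns `SKETCH_ERR_PROC_FAILED`. As the commit message notes, a success return from such a function does not by itself prove that the result is valid, which is why a separate validation step is still needed.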