- 13 Jan, 2011 8 commits
-
-
Darius Buntinas authored
[svn-r7720] Fix collectives to not hang if the communicator contains a failed process. The collectives do not return an error immediately upon detecting a failure; instead, they continue the communication pattern, so that other processes waiting to receive messages do not hang, and return the error at the end of the function. This means that, although the collective should complete at all processes, some processes will receive an error, and some processes may not get a valid result. Since some processes may receive an invalid result without receiving an error, a separate mechanism is needed to confirm that the collective has completed correctly, such as MPI_Comm_validate of the MPI-3 FT proposal.
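The deferred-error behaviour described in this commit can be illustrated with a minimal sketch. It is not the MPICH2 code: `run_pattern_deferred`, `fail_at`, `struct step_ctx` and the `SKETCH_*` codes are hypothetical names introduced here, and each "step" stands in for one send or receive of a collective's communication pattern.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes; a real MPI library would use MPI error classes. */
enum { SKETCH_SUCCESS = 0, SKETCH_ERR_PROC_FAILED = 1 };

/* Hypothetical per-step callback: performs one step of the pattern. */
typedef int (*step_fn)(int step, void *ctx);

/* Run every step even if one fails, and report the first error only at the end. */
int run_pattern_deferred(step_fn do_step, int nsteps, void *ctx)
{
    int first_err = SKETCH_SUCCESS;
    int i;

    assert(do_step != NULL);
    for (i = 0; i < nsteps; i++) {
        int rc = do_step(i, ctx);
        if (rc != SKETCH_SUCCESS && first_err == SKETCH_SUCCESS)
            first_err = rc;   /* remember the error, but keep communicating */
    }
    return first_err;         /* the error surfaces only here */
}

/* Example context: which step fails (-1 for none) and how many steps ran. */
struct step_ctx {
    int failed_step;
    int steps_executed;
};

/* Example step: pretends the peer used in step `failed_step` has failed. */
int fail_at(int step, void *ctx)
{
    struct step_ctx *c = (struct step_ctx *)ctx;
    c->steps_executed++;
    return step == c->failed_step ? SKETCH_ERR_PROC_FAILED : SKETCH_SUCCESS;
}
```

With a context of `{ 2, 0 }` and five steps, the call returns `SKETCH_ERR_PROC_FAILED` yet all five steps are executed, which is the property that keeps the other processes from hanging. As the commit message notes, a process whose own steps all succeed sees no error at all, so a separate validation step is still needed.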
-
Jayesh Krishna authored
-
Jayesh Krishna authored
-
Pavan Balaji authored
-
Pavan Balaji authored
-
Pavan Balaji authored
at all. We just change to the base directory and run the executable.
-
Pavan Balaji authored
processes. If we forked out the MPI processes (PMI_FD format), we preload the PMI FDs as well. If we are using a PMI PORT, we wait for the application to send a PMI initialization message before loading the PMI FD.
-
Pavan Balaji authored
-
- 12 Jan, 2011 9 commits
-
-
William Gropp authored
[svn-r7705] Updated the test on the nameserver choice to be compatible with the new default; remove reference to long-deleted ldap nameserver
-
Darius Buntinas authored
[svn-r7702] Make use of the atomic increment of the completion_count to bump the progress engine when a checkpoint is initiated or a failed process is detected. Also jump out of blocking_recv if the completion_count is bumped.
-
Darius Buntinas authored
yet, we still may have posted recvs; we need to look for those in a failure case. Similarly, we may have sends in the send queue even if the connection hasn't been fully established.
-
Rob Latham authored
library. If for some reason we need the MPI implementation, use --with-mpi-impl (but it looks like that historical flag doesn't actually do anything any longer).
-
Pavan Balaji authored
used is to chdir to the directory where the executable is located and launch the program from there. Fixes ticket #1160.
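A minimal sketch of the chdir-then-launch approach mentioned above, assuming a POSIX system. The helper names `split_exec_path` and `launch_from_exec_dir` are hypothetical and are not taken from the Hydra sources; the buffer sizes are arbitrary.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Split "dir/prog" into directory and file name.
 * Returns 0 on success, -1 if an output buffer is too small. */
int split_exec_path(const char *path, char *dir, size_t dir_sz,
                    char *base, size_t base_sz)
{
    const char *slash;
    const char *name;

    assert(path != NULL && dir != NULL && base != NULL);
    slash = strrchr(path, '/');
    name = slash ? slash + 1 : path;

    if (!slash) {
        /* No directory component: use the current directory. */
        if (dir_sz < 2)
            return -1;
        strcpy(dir, ".");
    } else {
        /* "/prog" has the root directory as its directory part. */
        size_t dir_len = (slash == path) ? 1 : (size_t)(slash - path);
        if (dir_len + 1 > dir_sz)
            return -1;
        memcpy(dir, path, dir_len);
        dir[dir_len] = '\0';
    }
    if (strlen(name) + 1 > base_sz)
        return -1;
    strcpy(base, name);
    return 0;
}

/* Change to the executable's directory and exec it from there.
 * Returns -1 on failure; does not return on success. */
int launch_from_exec_dir(const char *path, char *const argv[])
{
    char dir[4096], base[256], rel[260];

    if (split_exec_path(path, dir, sizeof dir, base, sizeof base) != 0)
        return -1;
    if (chdir(dir) != 0)
        return -1;
    snprintf(rel, sizeof rel, "./%s", base);
    execv(rel, argv);
    return -1;   /* reached only if execv failed */
}
```

Splitting the path first keeps the part that can be checked in isolation separate from the part that replaces the process image.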
-
Pavan Balaji authored
we are still not handling the case where a process running in the background might look for STDIN (in which case it should get an EOF). That's a random corner case, which we don't support right now.
-
Pavan Balaji authored
socket. This makes STDIN management independent of which bootstrap server we use. This provides the necessary setup to enable ticket
-
Pavan Balaji authored
proxies. But even if that's not successful, mpiexec should still exit.
-
Pavan Balaji authored
-outfile-pattern and -errfile-pattern.
-
- 11 Jan, 2011 7 commits
-
-
David Goodell authored
This fixes ticket #1122. This fix hasn't been comprehensively tested, but it has worked fine everywhere that I've tried it. No reviewer.
-
Darius Buntinas authored
[svn-r7687] Changed global completion count to use atomics rather than locks. This will allow us to update the completion count asynchronously from other threads or interrupt contexts.
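As an illustration of a lock-free completion counter, here is a minimal sketch using C11 `<stdatomic.h>`. This is an assumption made for the example: the commit does not say which atomics API is used, and the names `completion_count`, `bump_completion_count` and `completion_count_changed` are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical global counter standing in for the completion count. */
static atomic_uint completion_count;

/* Bump the counter without taking a lock, so the call is usable from
 * another thread. */
void bump_completion_count(void)
{
    atomic_fetch_add_explicit(&completion_count, 1u, memory_order_release);
}

/* Progress-loop helper: returns 1 if the counter has moved since
 * *last_seen (and updates *last_seen), 0 otherwise. */
int completion_count_changed(unsigned *last_seen)
{
    unsigned now;

    assert(last_seen != NULL);
    now = atomic_load_explicit(&completion_count, memory_order_acquire);
    if (now != *last_seen) {
        *last_seen = now;
        return 1;
    }
    return 0;
}
```

A blocking wait loop can call `completion_count_changed` on each iteration and leave the loop when it returns 1, which is one way to let an external event wake the progress engine.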
-
David Goodell authored
Prevents duplicate storage on platforms like darwin that don't support weak symbols and therefore have two libraries in order to support the PMPI profiling interface. Previously, there was also a copy in libpmpich.a, which contains the actual "MPI_" versions of the routines. No reviewer.
-
Darius Buntinas authored
-
David Goodell authored
In the old code it was possible that ROMIO freed the keyval multiple times, rather than just once. This was fine in MPICH2, which is robust in the face of such behavior, but caused problems for ROMIO over Open MPI. We now utilize the MPI-2.2 LIFO ordering for attribute destruction in order to free the keyval exactly once. Reported by Pascal Deveze and related to ticket #222. Reviewed by robl@.
-
Darius Buntinas authored
-
David Goodell authored
This is more natural to users because it behaves just like printf escape specifiers. Reviewed by balaji@.
-
- 10 Jan, 2011 3 commits
-
-
David Goodell authored
This was a feature request from Quincey in order to improve the effectiveness of the HDF5 test suite. No reviewer.
-
Darius Buntinas authored
-
David Goodell authored
This builds on r7653 and r7654. No reviewer.
-
- 06 Jan, 2011 3 commits
-
-
Darius Buntinas authored
-
Darius Buntinas authored
[svn-r7671] Merged in the error-return branch. This branch completes, with an error, recvs posted on failed VCs and anysource recvs on comms with failed VCs. See the log for r7669 for details.
-
David Goodell authored
No reviewer.
-
- 05 Jan, 2011 1 commit
-
-
Darius Buntinas authored
-
- 04 Jan, 2011 3 commits
-
-
David Goodell authored
Bill was sharp and caught my mistake: I hadn't entirely replaced sizeof(void*) with align_sz. No reviewer.
-
David Goodell authored
Reviewed by buntinas@.
-
David Goodell authored
Thanks to Cray for reporting these issues. No reviewer.
-
- 03 Jan, 2011 4 commits
-
-
Pavan Balaji authored
#1131.
-
Pavan Balaji authored
expression to be prepended to the output. -prepend-rank is a subset of this generalized capability.
-
Pavan Balaji authored
1. Move UI specific parameters to a different structure, so other components don't see them.
2. Rename HYD_handle to HYD_server_info to make the data in it clearer.
3. Rename hydra.h to hydra_server.h (this is only used by the server side). Also make sure the proxy side code does not include it.
4. Merge hydra_base.h and hydra_utils.h and make it hydra.h, as these are the utility functions used for both the servers and the proxies.
-
Pavan Balaji authored
1. Point out that C++, F77 and F90 support is optional and that users can disable it.
2. Remove the recommendation for VPATH builds. While this is a convenient option for system administrators or developers who maintain multiple copies of MPICH2 builds, it is confusing for users.
3. Remove redundant description of configure options.
4. Remove the MPICH_INTERFACE_HOSTNAME environment variable from the README, as in the default PM, we automatically pass it based on the host list provided by the user.
5. Clean up several descriptions.
-
- 30 Dec, 2010 2 commits
-
-
Pavan Balaji authored
ticket #1141.
-
Pavan Balaji authored
used at several places now.
-