- 12 Jan, 2015 2 commits
-
-
Wesley Bland authored
If we had a failure that caused a request to be pending, we were freeing the request before calling the error handler. That caused segfaults. Now we switch the ordering of the two to avoid that. This also moves the assignment of the status_ptr to be a little earlier to avoid another segfault. Signed-off-by:
Huiwei Lu <huiweilu@mcs.anl.gov>
-
Kenneth Raffenetti authored
CH3 ensures that self communication does not go through the netmod, so there is no need for a process to pause/unpause itself. Signed-off-by:
Antonio J. Pena <apenya@mcs.anl.gov>
-
- 08 Jan, 2015 2 commits
-
-
Su Huang authored
Signed-off-by:
Sameh Sharkawi <sssharka@us.ibm.com>
-
OpenMPI uses 'make dist', but MPICH does not. Some recently added (internal) header files were not listed in ROMIO's noinst declaration. Note: RobL combined and edited these OpenMPI patches into this patch: - e0927895db8d - 84c41429e9ac Signed-off-by:
Rob Latham <robl@mcs.anl.gov>
-
- 07 Jan, 2015 2 commits
-
-
Kenneth Raffenetti authored
Adding FCMODOUTFLAG directly to AM_FCFLAGS could cause conflicts with certain libtool flags (-module) during linking. This change allows us to set FCMODOUTFLAG during module creation, but not have it present during linking. Refs #2024 Signed-off-by:
Junchao Zhang <jczhang@mcs.anl.gov>
-
Kenneth Raffenetti authored
Recent versions of ifort on darwin will drop flags intended for the linker unless they are prefixed with "-Wl,". Jeff Hammond checked with the Intel compiler folks, and they confirmed that "-Wl," has been supported since the initial ifort release on OSX (9.1). Closes #2024 Signed-off-by:
Junchao Zhang <jczhang@mcs.anl.gov>
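As a rough illustration of the "-Wl," prefixing described in the commit above, the Python sketch below wraps linker-only flags so that the compiler driver forwards them to the linker. The helper name `wrap_linker_flags` and the sample flags are hypothetical and are not taken from the MPICH build system.

```python
def wrap_linker_flags(flags):
    """Prefix each linker-only flag with "-Wl," unless it already has it.

    Hypothetical helper for illustration; not part of MPICH's configure logic.
    """
    wrapped = []
    for flag in flags:
        if flag.startswith("-Wl,"):
            # Already in the form the compiler driver passes through.
            wrapped.append(flag)
        else:
            wrapped.append("-Wl," + flag)
    return wrapped


if __name__ == "__main__":
    # Sample flags are illustrative only.
    print(wrap_linker_flags(["-flat_namespace", "-Wl,-commons,use_dylibs"]))
```

A flag that takes an argument has to be written in the comma-separated form (as in the second sample) before it is passed to the helper.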
-
- 06 Jan, 2015 1 commit
-
-
Kenneth Raffenetti authored
Previous re-organization of the library symbols resulted in a situation where Fortran programs could no longer be profiled using tools written in C. Functions in libmpifort directly called the PMPI_* versions in libmpi. Now we always call the MPI_* versions from libmpifort. In the case where we are building a separate profiling library, we use a new preprocessor flag to ensure we call PMPI_* from inside libpmpi. Additional bug fix: - always define mpi_conversion_fn_null_, since there is no pmpi version. Fixes #2209 Signed-off-by:
Junchao Zhang <jczhang@mcs.anl.gov>
-
- 05 Jan, 2015 6 commits
-
-
William Gropp authored
Adds a way to pass a timelimit argument to the run command, as long as the timelimit is in seconds. This is enough for some of the MPICH versions of mpiexec and for recent versions of the Cray aprun command. Signed-off-by:
Wesley Bland <wbland@anl.gov>
-
William Gropp authored
Signed-off-by:
Wesley Bland <wbland@anl.gov>
-
William Gropp authored
Signed-off-by:
Wesley Bland <wbland@anl.gov>
-
William Gropp authored
Signed-off-by:
Wesley Bland <wbland@anl.gov>
-
William Gropp authored
Signed-off-by:
Wesley Bland <wbland@anl.gov>
-
Instead of using its own versioning system that wasn't getting updated with any regularity, now the test suite will use the same versioning scheme as mainline MPICH. This is consistent with other parts of MPICH that get distributed separately (MPL, ROMIO, Hydra). Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
- 04 Jan, 2015 2 commits
-
-
Squashes a warning when using the embedded versions of OPA and MPL. Signed-off-by:
Sangmin Seo <sseo@anl.gov>
-
We were incorrectly adding the build directories for mpl and opa to external_ldflags in Makefile.am, causing them to be listed in the installed libmpi.la libtool file. If a linker does not handle this potentially nonexistent build directory gracefully, it could cause an issue. Since the mpl and opa libraries are now embedded in libmpi by default, we simply eliminate the flags unless we are using pre-built, external libraries. Fixes #2208 Thanks to Markus Geimer for the bug report and suggested solution. Signed-off-by:
Sangmin Seo <sseo@anl.gov>
-
- 19 Dec, 2014 1 commit
-
-
Currently, MPI_File_close performs a barrier right before the code that closes the shared file pointer (and potentially unlinks the shared file itself) whenever the ADIO_SHARED_FP feature is enabled AND the ADIO_UNLINK_AFTER_CLOSE feature is disabled. PE testing on GPFS revealed a situation with the non-collective MPI_File_read_shared/MPI_File_write_shared in which all tasks need to wait for all other tasks to complete processing before the shared file pointer is unlinked; otherwise the open of the shared file pointer can fail.
The simplest example is 2 tasks that each do this: MPI_File_open, MPI_File_set_view, MPI_File_read_shared, MPI_File_close.
Both tasks call MPI_File_read_shared at the same time, which first calls ADIO_Get_shared_fp, which opens the shared file pointer file in create mode. Only 1 task can actually create the file, so there is a race to see which one gets it done first. If task 0 creates it, it wins the race and goes on to use it, reads the file, and then calls MPI_File_close, which first unlinks the shared file pointer and then closes the output file. Meanwhile, task 1 lost the race to create the file and is in error; the error handling in GPFS goes into effect and task 1 now just tries to open the file that task 0 created. The problem is that this error handling took longer than task 0 took to read and close the output file. At the time task 0 does the close, it is the only process with a link, since task 1 is still in the create-file error handling code, so GPFS goes ahead and deletes the shared file pointer. When the error handling code for task 1 completes and it tries to do the open, the file is no longer there, so the open fails, as does the subsequent read of the shared file pointer.
Currently GPFS has the ADIO_UNLINK_AFTER_CLOSE feature enabled, so the fix is to remove the additional condition that ADIO_UNLINK_AFTER_CLOSE be disabled for the barrier in the close to be performed. Presumably this could be an issue for any parallel file system, so this change is being made in the common code. See ticket #2214 Signed-off-by:
Paul Coffman <pkcoff@us.ibm.com> Signed-off-by:
Rob Latham <robl@mcs.anl.gov>
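The condition change and the two-task interleaving described in the commit above can be modeled roughly as follows. This is a deterministic Python sketch, not ROMIO code: the function names and the `fixed` parameter are hypothetical, and the "file system" is reduced to a single boolean.

```python
def shared_fp_close_needs_barrier(shared_fp_enabled, unlink_after_close, fixed=True):
    """Model of the barrier condition in the close path.

    fixed=False: old condition (shared FP enabled AND unlink-after-close disabled).
    fixed=True:  condition after the commit (shared FP enabled).
    """
    if fixed:
        return shared_fp_enabled
    return shared_fp_enabled and not unlink_after_close


def task1_open_succeeds(barrier):
    """Replay the two-task scenario; return True if task 1's retry open works."""
    file_exists = True  # task 0 won the race and created the shared-pointer file
    if barrier:
        # Task 0 waits at the barrier, so task 1 finishes its error
        # handling and opens the file before anything is unlinked.
        task1_opened = file_exists
        file_exists = False  # unlink happens only after the barrier
    else:
        # Task 0 reads, closes, and unlinks while task 1 is still in
        # its create-failure error handling.
        file_exists = False
        task1_opened = file_exists
    return task1_opened


if __name__ == "__main__":
    # GPFS case from the commit message: unlink-after-close is enabled.
    old = shared_fp_close_needs_barrier(True, True, fixed=False)
    new = shared_fp_close_needs_barrier(True, True, fixed=True)
    print(old, task1_open_succeeds(old))
    print(new, task1_open_succeeds(new))
```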
-
- 18 Dec, 2014 1 commit
-
-
Kenneth Raffenetti authored
Signed-off-by:
Junchao Zhang <jczhang@mcs.anl.gov>
-
- 17 Dec, 2014 2 commits
-
-
Signed-off-by:
Rob Latham <robl@mcs.anl.gov>
-
Junchao Zhang authored
See http://lists.mpich.org/pipermail/discuss/2014-December/003531.html Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
- 16 Dec, 2014 21 commits
-
-
Xin Zhao authored
When lock_epoch_count != 0, we only need to check whether access_state is PER_TARGET in Win_lock. No reviewer.
-
Xin Zhao authored
When the data is dropped but the lock is queued, we should still store the lock entry in the current request, so that we can try to acquire the lock once we have received and dropped all the data. No reviewer.
-
Xin Zhao authored
Here we should first dequeue the current lock queue entry from the lock queue and then perform the operation in it. This is because performing the operation in the current lock entry may trigger the release_lock() function, which checks the lock queue again. If we did not remove the current entry from the queue, release_lock() would try to process it a second time, which leads to incorrect execution. No reviewer.
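The ordering argument above can be illustrated with a small Python model. This is a sketch only: the class `LockQueue`, its `perform` method, and the `processed` list are hypothetical stand-ins, and the only name taken from the commit message is release_lock.

```python
from collections import deque


class LockQueue:
    """Toy model of a lock queue whose operations can re-enter release_lock()."""

    def __init__(self):
        self.entries = deque()
        self.processed = []  # records every time an entry's operation runs

    def perform(self, entry):
        self.processed.append(entry)
        # Performing the operation may release the lock, which makes
        # release_lock() scan the queue again (re-entrant call).
        self.release_lock()

    def release_lock(self):
        while self.entries:
            # Dequeue first, so the re-entrant release_lock() call made
            # inside perform() can no longer see this entry ...
            entry = self.entries.popleft()
            # ... and only then perform the operation stored in it.
            self.perform(entry)


if __name__ == "__main__":
    q = LockQueue()
    q.entries.extend(["op_a", "op_b", "op_c"])
    q.release_lock()
    print(q.processed)
```

If the entry were left at the head of the queue while `perform` ran, the nested `release_lock` call would pick up the same entry again, which is the double processing the commit avoids.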
-
Xin Zhao authored
No reviewer.
-
Xin Zhao authored
If user set no_locks to true, we do not need to allocate passive lock requests pool and lock data pool on window. No reviewer.
-
Xin Zhao authored
No reviewer.
-
Xin Zhao authored
No reviewer.
-
Xin Zhao authored
No reviewer.
-
Xin Zhao authored
No reviewer.
-
Xin Zhao authored
The behavior of the UNLOCK_ACK flag is exactly the same as the behavior of FLUSH_ACK, so here we just delete the UNLOCK_ACK flag and use the FLUSH_ACK flag for all FLUSH ACK packets. No reviewer.
-
Xin Zhao authored
No reviewer.
-
Xin Zhao authored
No reviewer.
-
Xin Zhao authored
Add new flags for four different kinds of LOCK ACKs: (1) LOCK_GRANTED: the lock is granted on the target. (2) LOCK_QUEUED_DATA_QUEUED: the lock is not granted on the target, but it is safely queued on the target. If this lock request is sent with an RMA operation, the operation data is also safely queued on the target. (3) LOCK_QUEUED_DATA_DISCARDED: the lock is not granted on the target, but it is safely queued on the target. If this lock request is sent with an RMA operation, the operation data is discarded on the target because the target is out of resources. (4) LOCK_DISCARDED: the lock is not granted on the target, and it is not queued on the target because the target is out of resources. If this lock request is sent with an RMA operation, the operation data is also discarded on the target. No reviewer.
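One way to picture how a target might choose among the four flags is the Python sketch below. The flag names come from the commit message; the numeric values, the function `classify_lock_ack`, and its parameters are assumptions made for illustration. The sketch also assumes that a lock request carrying no operation data is passed with `data_queued=True`.

```python
from enum import Enum


class LockAck(Enum):
    # Names follow the commit message; numeric values are arbitrary.
    LOCK_GRANTED = 1
    LOCK_QUEUED_DATA_QUEUED = 2
    LOCK_QUEUED_DATA_DISCARDED = 3
    LOCK_DISCARDED = 4


def classify_lock_ack(lock_granted, lock_queued, data_queued):
    """Pick the ACK flag from what the target managed to do with a request."""
    if lock_granted:
        return LockAck.LOCK_GRANTED
    if lock_queued:
        # The lock is queued; the flag tells the origin whether any
        # accompanying operation data survived as well.
        if data_queued:
            return LockAck.LOCK_QUEUED_DATA_QUEUED
        return LockAck.LOCK_QUEUED_DATA_DISCARDED
    # Neither the lock nor any accompanying data could be kept.
    return LockAck.LOCK_DISCARDED


if __name__ == "__main__":
    print(classify_lock_ack(False, True, False).name)
```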
-
Xin Zhao authored
Because we will send different kinds of LOCK ACKs (not just LOCK_GRANTED but also, for example, LOCK_DISCARDED), naming the related packets and functions "LOCK_GRANTED" is no longer appropriate. Here we rename them to "LOCK_ACK". No reviewer.
-
Xin Zhao authored
No reviewer.
-
Xin Zhao authored
No reviewer.
-
Xin Zhao authored
Rewrite progress engine functions as follows. Basic functions: (1) check_target_state: check whether we can switch target state, and issue synchronization messages if needed. (2) issue_ops_target: issue all pending operations to this target. (3) check_window_state: check whether we can switch window state. (4) issue_ops_win: issue all pending operations on this window. Currently it internally calls check_target_state and issue_ops_target; it should be optimized in the future. Progress-making functions: (1) Make_progress_target: make progress on one target, which internally calls check_target_state and issue_ops_target. (2) Make_progress_win: make progress on all targets on one window, which internally calls check_window_state and issue_ops_win. (3) Make_progress_global: make progress on all windows, which internally calls make_progress_win. No reviewer.
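The call hierarchy listed above can be sketched in Python as follows. The function names mirror the commit message (lowercased here); the `Target` and `Window` classes, the placeholder state checks, and the string operations are hypothetical and say nothing about the real CH3 data structures.

```python
class Target:
    def __init__(self, pending_ops):
        self.pending_ops = list(pending_ops)
        self.issued = []


class Window:
    def __init__(self, targets):
        self.targets = list(targets)


def check_target_state(target):
    """Placeholder: the real function may switch state and send sync messages."""
    return True


def issue_ops_target(target):
    # Issue all pending operations to this target.
    target.issued.extend(target.pending_ops)
    target.pending_ops.clear()


def check_window_state(win):
    """Placeholder for the window-level state check."""
    return True


def issue_ops_win(win):
    # As in the commit message, this currently goes target by target.
    for target in win.targets:
        check_target_state(target)
        issue_ops_target(target)


def make_progress_target(target):
    check_target_state(target)
    issue_ops_target(target)


def make_progress_win(win):
    check_window_state(win)
    issue_ops_win(win)


def make_progress_global(windows):
    for win in windows:
        make_progress_win(win)


if __name__ == "__main__":
    t1, t2 = Target(["put", "get"]), Target(["acc"])
    make_progress_global([Window([t1, t2])])
    print(t1.issued, t2.issued)
```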
-
Xin Zhao authored
No reviewer.
-
Xin Zhao authored
No reviewer.
-
Xin Zhao authored
No reviewer.
-
Xin Zhao authored
(1) Win_fence/Win_start: set access state right after we issue synchronization calls. (2) Win_post: set exposure state at the beginning. (3) Win_wait/Win_test: set exposure state at the end. (4) Win_lock/Win_lock_all: set access state at the beginning. (5) Win_unlock/Win_unlock_all: set access state at the end. No reviewer.
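The five rules above amount to a small lookup table, sketched here in Python. The call names and the state each one sets come from the commit message; the dictionary `STATE_UPDATES`, the timing labels, and the helper `calls_updating` are hypothetical.

```python
# call name -> (which state is set, when it is set within the call)
STATE_UPDATES = {
    "Win_fence":      ("access",   "after_sync_issued"),
    "Win_start":      ("access",   "after_sync_issued"),
    "Win_post":       ("exposure", "beginning"),
    "Win_wait":       ("exposure", "end"),
    "Win_test":       ("exposure", "end"),
    "Win_lock":       ("access",   "beginning"),
    "Win_lock_all":   ("access",   "beginning"),
    "Win_unlock":     ("access",   "end"),
    "Win_unlock_all": ("access",   "end"),
}


def calls_updating(state, when):
    """Return the calls that set the given state at the given point, sorted."""
    return sorted(
        call for call, (s, w) in STATE_UPDATES.items() if s == state and w == when
    )


if __name__ == "__main__":
    print(calls_updating("access", "beginning"))
```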
-