- 26 Jun, 2015 9 commits
-
-
Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
After Win_flush_local/Win_flush_local_all/Win_flush/Win_flush_all, the upgrade_flush_local flag should be reset to 0. This reset was missing in Win_flush and Win_flush_all; this patch adds it. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Originally, RMA synchronization always tried to piggyback LOCK/UNLOCK/FLUSH flags on operations by delaying the issuing of some operations. This works well when the number of operations is very small, but the delay hurts when the message size or the number of operations is large. This patch adds a CVAR that turns piggybacking of LOCK/UNLOCK/FLUSH flags on or off. It is off by default, which means we piggyback only when operations are already available, never at the cost of delaying their issue. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Since it does not help performance. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Originally we poked the progress engine at the end of RMA sync calls if it had not been poked earlier in the call, in order to prevent a possible deadlock. However, the deadlock can occur only in self-lock cases; when the target is not the calling process, the poke adds unnecessary overhead to RMA sync calls. This patch removes those progress pokes and keeps only the ones for the case where the target is the calling process. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
This patch adds a reduce-scatter based algorithm to MPI_Win_fence, which is used when the number of processes is small or medium. With this algorithm, memory usage is O(P), but the ending FENCE only needs to wait for local completion, not remote completion. When the number of processes is large, FENCE switches to the original barrier-based algorithm, which has O(1) memory usage but must wait for remote completion in the ending FENCE. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
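The trade-off described in the commit above can be sketched as a small selection rule. This is only an illustration: the names `fence_algo_t`, `fence_choose_algo`, `fence_counter_slots`, and `fence_end_waits_for_remote`, as well as the `threshold` parameter, are hypothetical and do not come from the patch, which does not state the actual cutoff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; the patch itself does not expose this API. */
typedef enum { FENCE_REDUCE_SCATTER, FENCE_BARRIER } fence_algo_t;

/* Assumed rule: use reduce-scatter up to some process-count threshold.
 * The threshold is an illustrative tunable, not a value from the patch. */
fence_algo_t fence_choose_algo(int nprocs, int threshold)
{
    assert(nprocs > 0);
    return (nprocs <= threshold) ? FENCE_REDUCE_SCATTER : FENCE_BARRIER;
}

/* Bookkeeping slots kept per process: O(P) for reduce-scatter, O(1) for barrier. */
size_t fence_counter_slots(fence_algo_t algo, int nprocs)
{
    return (algo == FENCE_REDUCE_SCATTER) ? (size_t) nprocs : 1;
}

/* Only the barrier-based variant must wait for remote completion in the ending FENCE. */
bool fence_end_waits_for_remote(fence_algo_t algo)
{
    return algo == FENCE_BARRIER;
}
```

With an assumed threshold of 64, a call such as `fence_choose_algo(8, 64)` would pick the reduce-scatter variant, while `fence_choose_algo(1024, 64)` would fall back to the barrier variant.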
-
Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
- 24 Jun, 2015 1 commit
-
-
Xin Zhao authored
The original implementation of Win_flush_local counts the total number of local and remote completions it needs to wait for, and then waits for the current local/remote completion counts to reach those values. The bug is that the current counts must be reset to zero on every iteration of the while loop; otherwise targets that have already completed are counted again, and we fail to wait for some targets to complete. This patch fixes the issue. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
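A minimal sketch of the counting pattern behind this fix, under simplifying assumptions: `target_t`, `poke_progress`, and `wait_all_targets` are hypothetical stand-ins, and the "progress engine" here simply completes one pending step per target per poke. The point is only the placement of the reset inside the loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical target record: 'pending' steps remain until it is done. */
typedef struct {
    bool done;
    int pending;
} target_t;

/* Stand-in for the progress engine: advances every unfinished target by one step. */
static void poke_progress(target_t *targets, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!targets[i].done && --targets[i].pending <= 0)
            targets[i].done = true;
    }
}

/* Waits until all targets are done; returns how many times progress was poked. */
int wait_all_targets(target_t *targets, size_t n)
{
    assert(targets != NULL);
    size_t completed;
    int pokes = 0;
    do {
        completed = 0;  /* the fix: recount from zero on every pass */
        for (size_t i = 0; i < n; i++) {
            if (targets[i].done)
                completed++;
        }
        if (completed < n) {
            poke_progress(targets, n);
            pokes++;
        }
    } while (completed < n);
    return pokes;
}
```

If `completed` were initialized once outside the loop, a target that finished early would be added again on each pass, and the loop could exit while other targets were still pending.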
-
- 14 Jun, 2015 1 commit
-
-
The outstanding_acks counter was incremented at each sync call (such as fence and flush), and then had to be decremented again if a flush ack was not required. It is more straightforward to increment it only when the flush packet is issued (as a piggybacked FLUSH flag or a separate flush message). Signed-off-by:
Xin Zhao <xinzhao3@illinois.edu> Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
- 12 Jun, 2015 6 commits
-
-
Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
This patch moves the local/remote completion checks out of the GC function into separate macros, so that the GC function only performs garbage collection. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Rename "disable_flush_local" to "upgrade_flush_local", which indicates that we upgrade FLUSH_LOCAL to FLUSH. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
have_remote_incomplete_ops is not actually used in the code, so we remove it here. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
- 03 Mar, 2015 1 commit
-
-
Xin Zhao authored
No reviewer.
-
- 16 Dec, 2014 12 commits
-
-
Xin Zhao authored
When lock_epoch_count != 0, Win_lock only needs to check whether access_state is PER_TARGET. No reviewer.
-
Xin Zhao authored
(1) Win_fence/Win_start: set access state right after we issue synchronization calls. (2) Win_post: set exposure state at beginning. (3) Win_wait/Win_test: set exposure state at end. (4) Win_lock/Win_lock_all: set access state at beginning. (5) Win_unlock/Win_unlock_all: set access state at end. No reviewer.
-
Xin Zhao authored
In Win_complete, release all requests on window; in Win_unlock_all, reset lock_assert on window. No reviewer.
-
Xin Zhao authored
We always need to allocate an array to store group ranks, even in the MPI_MODE_NOCHECK case, because Win_complete always needs it. No reviewer.
-
Xin Zhao authored
For Win_fence, Win_complete and Win_unlock_all, check if all targets are freed at the end of function calls. No reviewer.
-
Xin Zhao authored
We call memory barriers at the proper places in RMA sync calls, as follows, and remove unnecessary ones: (1) Win_fence: very beginning and very end. (2) Win_post/Win_complete: very beginning. (3) Win_start/Win_wait/Win_test: very end. (4) Win_lock/Win_lock_all: very end. (5) Win_unlock/Win_unlock_all: very beginning. (6) Win_flush/Win_flush_local/Win_flush_all/Win_flush_local_all: very beginning. For the reasoning, see the comments at the beginning of src/mpid/ch3/src/ch3u_rma_sync.c. No reviewer.
-
Xin Zhao authored
In ending RMA synchronization calls, we poke the progress engine at the end if we have not poked it earlier in the call. Some programs depend on incoming events being handled by the progress engine; if we never process them, the program may deadlock. No reviewer.
-
Xin Zhao authored
Originally the free_ops_before_completion function only worked with active target. Here we modify it to accommodate passive target as well. Also, every time we trigger free_ops_before_completion, we lose the chance to do a real Win_flush_local operation and must do a Win_flush instead. Here we convert Win_flush_local to Win_flush if the disable_flush_local flag is set, and unset that flag after the current flush is done. No reviewer.
-
Xin Zhao authored
No reviewer.
-
Xin Zhao authored
No reviewer.
-
Xin Zhao authored
accumulated_ops_cnt tracks the number of RMA operations posted between two synchronization calls, so that we can decide when to poke the progress engine based on the current value of this counter. Here we initialize it to zero in the BEGINNING synchronization calls (Win_fence, Win_start, first Win_lock, Win_lock_all), and correctly decrement it in the ENDING synchronization calls (Win_fence, Win_complete, Win_unlock, Win_unlock_all, Win_flush, Win_flush_local, Win_flush_all, Win_flush_local_all). We also use a per-target counter to track the single-target case. No reviewer.
-
Xin Zhao authored
Arrange RMA sync functions in src/mpid/ch3/src/ch3u_rma_sync.c in the following order: Win_fence, Win_post, Win_start, Win_complete, Win_wait, Win_test, Win_lock, Win_unlock, Win_flush, Win_flush_local, Win_lock_all, Win_unlock_all, Win_flush_all, Win_flush_local_all, Win_sync. No reviewer.
-
- 13 Nov, 2014 1 commit
-
-
Xin Zhao authored
When the operation pending list and the request lists are all empty, the origin needs to send a FLUSH message only if it has issued PUT/ACC operations since the last synchronization call; otherwise the origin does not need to issue FLUSH at all and does not need to wait for a FLUSH ACK message. Similarly, the origin waits for the ACK of an UNLOCK message only if it has issued PUT/ACC operations since the last synchronization call. However, the UNLOCK message always needs to be sent, because the origin needs to unlock the target process. This patch avoids issuing unnecessary FLUSH / FLUSH ACK / UNLOCK ACK messages. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
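The rules in the commit above, for the case where the pending and request lists are empty, reduce to a small decision table. The sketch below encodes it with hypothetical names (`sync_plan_t`, `plan_flush`, `plan_unlock`); it is not the code from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result type: what the origin must do for one sync message. */
typedef struct {
    bool send_msg;      /* does the origin send the message at all? */
    bool wait_for_ack;  /* does the origin wait for the matching ACK? */
} sync_plan_t;

/* FLUSH: sent, and its ACK awaited, only if PUT/ACC ops were issued since the last sync. */
sync_plan_t plan_flush(bool put_acc_since_last_sync)
{
    sync_plan_t plan = { put_acc_since_last_sync, put_acc_since_last_sync };
    return plan;
}

/* UNLOCK: always sent (the target must be unlocked); ACK awaited only after PUT/ACC ops. */
sync_plan_t plan_unlock(bool put_acc_since_last_sync)
{
    sync_plan_t plan = { true, put_acc_since_last_sync };
    return plan;
}
```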
-
- 12 Nov, 2014 1 commit
-
-
Wesley Bland authored
The errflag value used in the MPIC helper functions only propagated whether or not an error occurred. It did not contain any information about what kind of error occurred, which made returning the correct error code after a process failure impossible. This patch converts the binary value to an enum with three options: MPIR_ERR_NONE, MPIR_ERR_PROC_FAILED, and MPIR_ERR_OTHER. The original FALSE and TRUE values map to MPIR_ERR_NONE and MPIR_ERR_OTHER, respectively. MPIR_ERR_PROC_FAILED indicates that the error occurred because of a process failure. It uses the new bit set aside from the tag space to track such information between processes. This change required modifying many function signatures and type declarations to use the new enum type, but these changes are not very intrusive and should not be a problem going forward. Signed-off-by:
Huiwei Lu <huiweilu@mcs.anl.gov>
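As a rough illustration of the change above, the sketch below declares an enum with the three values named in the commit message and maps the old boolean flag onto it. The type name `errflag_t` and the helpers `errflag_from_bool` and `errflag_is_proc_failure` are hypothetical, and the enumerator values here are plain defaults, not necessarily those used in the real code.

```c
#include <assert.h>
#include <stdbool.h>

/* Enumerator names are from the commit message; the type name is hypothetical. */
typedef enum {
    MPIR_ERR_NONE,         /* no error (old value: FALSE) */
    MPIR_ERR_PROC_FAILED,  /* error caused by a process failure */
    MPIR_ERR_OTHER         /* any other error (old value: TRUE) */
} errflag_t;

/* Maps the legacy boolean flag onto the enum. */
errflag_t errflag_from_bool(bool error_occurred)
{
    return error_occurred ? MPIR_ERR_OTHER : MPIR_ERR_NONE;
}

/* The extra information the boolean flag could not carry. */
bool errflag_is_proc_failure(errflag_t flag)
{
    return flag == MPIR_ERR_PROC_FAILED;
}
```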
-
- 11 Nov, 2014 2 commits
-
-
Signed-off-by:
Rob Latham <robl@mcs.anl.gov>
-
Signed-off-by:
Rob Latham <robl@mcs.anl.gov>
-
- 03 Nov, 2014 6 commits
-
-
Xin Zhao authored
Add some original RMA PVARs back to the new RMA infrastructure, including timing of packet handlers, op allocation and setting, window creation, etc. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Xin Zhao authored
We made a huge change to the RMA infrastructure, and a lot of old code can be dropped, including separate handlers for lock-op-unlock, ACCUM_IMMED-specific code, O(p) data structure code, lazy-issuing code, etc. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Xin Zhao authored
We use new algorithms for RMA synchronization functions and RMA epochs. The old implementation used a lazy-issuing algorithm, which queues up all operations and issues them at the end. This rules out opportunities to use hardware RMA operations and can exhaust memory when a large number of operations is queued. The new algorithm starts the synchronization at the beginning and issues operations as soon as the synchronization is finished. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Xin Zhao authored
We define new states to indicate the current status of RMA synchronization. The states include both ACCESS states and EXPOSURE states, and specify whether the synchronization is initialized (_CALLED), ongoing (_ISSUED), or completed (_GRANTED). For a single lock in passive target, we use a per-target state, and the window state is set to PER_TARGET. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
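The three phases named in the commit above can be pictured as a small state machine. The sketch below is a simplification with hypothetical names (`sync_phase_t`, `sync_next`, `sync_can_issue_ops`); the real code distinguishes ACCESS and EXPOSURE states per synchronization type. The rule that operations are issued only once the phase is granted is an assumption taken from the neighboring commit that introduces the new algorithm.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified phases mirroring the _CALLED/_ISSUED/_GRANTED suffixes. */
typedef enum { SYNC_NONE, SYNC_CALLED, SYNC_ISSUED, SYNC_GRANTED } sync_phase_t;

/* Advances one step: initialized -> ongoing -> completed. GRANTED is terminal here. */
sync_phase_t sync_next(sync_phase_t phase)
{
    switch (phase) {
    case SYNC_NONE:    return SYNC_CALLED;
    case SYNC_CALLED:  return SYNC_ISSUED;
    case SYNC_ISSUED:  return SYNC_GRANTED;
    case SYNC_GRANTED: return SYNC_GRANTED;
    }
    assert(0 && "unknown phase");
    return SYNC_NONE;
}

/* Assumption: operations may be issued only after synchronization has completed. */
bool sync_can_issue_ops(sync_phase_t phase)
{
    return phase == SYNC_GRANTED;
}
```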
-
Instead of allocating / deallocating RMA operations whenever an RMA op is posted by user, we allocate fixed size operation pools beforehand and take the op element from those pools when an RMA op is posted. With only a local (per-window) op pool, the number of ops allocated can increase arbitrarily if many windows are created. Alternatively, if we only use a global op pool, other windows might use up all operations thus starving the window we are working on. In this patch we create two pools: a local (per-window) pool and a global pool. Every window is guaranteed to have at least the number of operations in the local pool. If we run out of these operations, we check in the global pool to see if we have any operations left. When an operation is released, it is added back to the same pool it was allocated from. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
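The two-pool scheme described in the commit above might look roughly like the following. This is a sketch under stated assumptions, not the patch's code: the pool sizes, the type names (`rma_op_t`, `op_pool_t`, `window_t`), and the functions (`global_pool_init`, `window_init`, `op_alloc`, `op_free`) are all hypothetical, and the pools are simple free lists.

```c
#include <assert.h>
#include <stddef.h>

#define LOCAL_POOL_SIZE 2   /* hypothetical sizes, kept tiny for illustration */
#define GLOBAL_POOL_SIZE 2

typedef enum { POOL_LOCAL, POOL_GLOBAL } pool_kind_t;

typedef struct rma_op {
    struct rma_op *next;
    pool_kind_t pool;       /* remembers which pool the element came from */
} rma_op_t;

typedef struct { rma_op_t *head; } op_pool_t;

typedef struct {
    rma_op_t storage[LOCAL_POOL_SIZE];  /* per-window guaranteed elements */
    op_pool_t local;
} window_t;

static rma_op_t global_storage[GLOBAL_POOL_SIZE];
static op_pool_t global_pool;

/* Threads all elements of 'storage' onto the pool's free list. */
static void pool_init(op_pool_t *pool, rma_op_t *storage, size_t n, pool_kind_t kind)
{
    pool->head = NULL;
    for (size_t i = 0; i < n; i++) {
        storage[i].pool = kind;
        storage[i].next = pool->head;
        pool->head = &storage[i];
    }
}

/* Pops one element, or returns NULL if the pool is empty. */
static rma_op_t *pool_pop(op_pool_t *pool)
{
    rma_op_t *op = pool->head;
    if (op != NULL)
        pool->head = op->next;
    return op;
}

void global_pool_init(void)
{
    pool_init(&global_pool, global_storage, GLOBAL_POOL_SIZE, POOL_GLOBAL);
}

void window_init(window_t *win)
{
    pool_init(&win->local, win->storage, LOCAL_POOL_SIZE, POOL_LOCAL);
}

/* Tries the window's local pool first, then the shared global pool. */
rma_op_t *op_alloc(window_t *win)
{
    rma_op_t *op = pool_pop(&win->local);
    if (op == NULL)
        op = pool_pop(&global_pool);
    return op;  /* NULL when both pools are exhausted */
}

/* Returns the element to the same pool it was allocated from. */
void op_free(window_t *win, rma_op_t *op)
{
    assert(op != NULL);
    op_pool_t *pool = (op->pool == POOL_LOCAL) ? &win->local : &global_pool;
    op->next = pool->head;
    pool->head = op;
}
```

Because each window keeps its own local elements, a busy window can drain the global pool without taking away the guaranteed minimum of any other window.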
-
We were duplicating information in the operation structure and in the packet structure when the message is actually issued. Since most of the information is the same anyway, this patch just embeds a packet structure into the operation structure, so that we eliminate the unnecessary copy. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
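The embedding idea in the commit above can be illustrated with a pair of structs. All names and fields here (`pkt_t`, `rma_op_t`, `PKT_PUT`, `op_init_put`, `op_packet`) are hypothetical; the real packet and operation structures carry far more state.

```c
#include <assert.h>
#include <stddef.h>

enum { PKT_PUT = 1 };   /* hypothetical packet type tag */

/* Hypothetical wire header: the fields sent to the target. */
typedef struct {
    int type;
    int target_rank;
    size_t count;
} pkt_t;

/* The operation embeds the packet instead of duplicating its fields. */
typedef struct rma_op {
    struct rma_op *next;
    pkt_t pkt;          /* filled once, when the op is posted */
} rma_op_t;

/* Posting an op writes directly into the embedded packet. */
void op_init_put(rma_op_t *op, int target_rank, size_t count)
{
    assert(op != NULL);
    op->next = NULL;
    op->pkt.type = PKT_PUT;
    op->pkt.target_rank = target_rank;
    op->pkt.count = count;
}

/* Issuing the op hands out the embedded packet; no copy into a second struct. */
const pkt_t *op_packet(const rma_op_t *op)
{
    assert(op != NULL);
    return &op->pkt;
}
```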
-