- 19 Dec, 2013 1 commit
-
-
Fixes #1963 Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
- 17 Dec, 2013 1 commit
-
-
Junchao Zhang authored
Fixes #1962 Signed-off-by: Junchao Zhang <jczhang@mcs.anl.gov> (Reviewed by Bill Gropp)
-
- 15 Nov, 2013 4 commits
-
-
Xin Zhao authored
Delete the code for zero-size data transfer in the packet handlers of Put/Accumulate/Accumulate_Immed/Get_AccumulateResp/GetResp/LockPutUnlock/LockAccumUnlock, because it is redundant. (Note that the packet handlers of LockPutUnlock and LockAccumUnlock are for the single-operation optimization in passive RMA.) Zero-size data transfer is already handled when RMA operations are issued (L146, L258, L369 in src/mpid/ch3/src/ch3u_rma_ops.c and L50 in src/mpid/ch3/src/ch3u_rma_acc_ops.c): the RMA operation routines exit directly if the data size is zero. Signed-off-by:
Wesley Bland <wbland@mcs.anl.gov>
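The early exit described in this commit can be sketched as follows. This is only an illustration, not the code in ch3u_rma_ops.c: `sketch_win_t`, `sketch_put`, and the `queued_ops` counter are hypothetical names introduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a window's RMA operation queue (not the MPICH type). */
typedef struct {
    int queued_ops;   /* number of operations enqueued so far */
} sketch_win_t;

/* Returns 1 if an operation was enqueued, 0 if the call was a no-op. */
int sketch_put(sketch_win_t *win, size_t count, size_t type_size)
{
    assert(win != NULL);

    /* Zero-size transfer: return before anything is queued, so the
     * target-side packet handlers never have to handle an empty payload. */
    if (count * type_size == 0)
        return 0;

    win->queued_ops++;
    return 1;
}
```

Because the check sits in the issuing routine, a second check in each packet handler would never fire, which is the redundancy the commit removes.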
-
Antonio J. Pena authored
This reverts commit 676c29f9.
-
Antonio J. Pena authored
Addresses #1932. Includes: - MPI_Bsend/MPI_Ibsend - Several collectives - Some RMA operations - MPI_Dist_graph_create Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
Xin Zhao authored
MPIU_Assert at L2311 checks if rma_ops_list is empty before exiting MPIDI_Win_flush. It causes /test/mpi/threads/rma/multirma to fail because while one thread is executing the loop of poking progress engine at L2293 ~ L2302, another thread may enqueue new RMA operations to rma_ops_list. rma_ops_list has already been checked for empty before exiting MPIDI_CH3I_Do_passive_target_rma (L2724) to ensure that all enqueued operations are issued out, therefore it does not need to be checked again here. Signed-off-by:
Wesley Bland <wbland@mcs.anl.gov>
-
- 31 Oct, 2013 1 commit
-
-
Also includes random fixes to `-Wshorten-64-to-32` warnings which might need to be teased out. Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
- 26 Oct, 2013 1 commit
-
-
To adapt to naming for control variables in MPI_T. Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
- 26 Sep, 2013 11 commits
-
-
Pavan Balaji authored
The check was originally in the ch3 layer, but doesn't seem to use any ch3 specific information. This macro will be useful at the upper layers for optimizations, e.g., in the localcopy routine. Signed-off-by:
Xin Zhao <xinzhao3@illinois.edu>
-
Pavan Balaji authored
The memory barrier ensures that all load/store operations issued directly to shared memory are complete. Signed-off-by:
Xin Zhao <xinzhao3@illinois.edu>
-
When SHM is allocated, the origin rank and the target rank may still be on different nodes; in that situation the operations are not yet complete and win_flush cannot exit. Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
Check the shm_allocated flag in win_flush to determine whether to do a full memory barrier. Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
Pavan Balaji authored
During a win_flush_all, if a target does not have any operations to flush out, don't call the win_flush function at all. This reduces the number of function calls on large systems where the RMA operations are sparsely issued. Signed-off-by:
Xin Zhao <xinzhao3@illinois.edu>
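A minimal sketch of the skip described in this commit, assuming a per-target count of pending operations. The `sketch_*` names, the fixed-size `pending` array, and the `flush_calls` counter are hypothetical and exist only to make the effect observable.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_TARGETS 8

/* Hypothetical window state: one pending-operation count per target rank. */
typedef struct {
    int pending[SKETCH_MAX_TARGETS];  /* ops queued per target */
    int flush_calls;                  /* how many per-target flushes ran */
} sketch_flush_win_t;

/* Stand-in for the per-target flush: drains the target's queue. */
void sketch_flush(sketch_flush_win_t *win, int target)
{
    win->pending[target] = 0;
    win->flush_calls++;
}

/* Flush every target, but skip targets that have nothing queued. */
void sketch_flush_all(sketch_flush_win_t *win, int ntargets)
{
    assert(win != NULL && ntargets <= SKETCH_MAX_TARGETS);
    for (int t = 0; t < ntargets; t++) {
        if (win->pending[t] == 0)
            continue;               /* no call at all for idle targets */
        sketch_flush(win, t);
    }
}
```

With sparsely issued operations, the number of flush calls tracks the number of active targets rather than the size of the communicator.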
-
If SHM is allocated by MPI_Win_allocate and the target is on the same node as the origin, the origin needs to acquire the lock eagerly before it can perform any SHM RMA operations immediately on the target's SHM region. Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
Originally, for SHM RMA operations, we created structures to queue them up and performed them lazily when closing the epoch. Because creating the queued structures causes significant performance overhead, we decided not to queue them up but to perform them immediately. Therefore the MPIDI_DO_SHM_OP macro and some special-case checks on SHM operations (to count queued operations) are no longer needed. Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
Change the condition for the full memory barrier when closing an epoch from *judging create_flavor* to *checking if SHM is allocated*, because *SHM is allocated* means that either create_flavor is SHARED or the alloc_shm optimization is enabled for MPI_Win_allocate. Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
We don't need the full memory barrier when opening an epoch: the ordering of modifications to the same window location is protected by the full memory barrier when closing the epoch. The user can modify a window location only within an RMA epoch. Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
We don't need the full memory barrier when opening an epoch: the ordering of modifications to the same window location is protected by the full memory barrier when closing the epoch. The user can modify a window location only within an RMA epoch. Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
Do a memory barrier when the window is allocated by MPI_Win_allocate_shared, if this fence is (1) not called with MPI_MODE_NOPRECEDE; (2) not the very first fence; (3) not following a fence with MPI_MODE_NOSUCCEED. Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
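The three fence conditions can be sketched as a single predicate. This is an illustration under stated assumptions: the `SKETCH_MODE_*` bits are hypothetical stand-ins for the MPI assertion constants (so the sketch compiles without mpi.h), and `sketch_fence_state_t` is not an MPICH structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit values standing in for the MPI fence assertion constants. */
enum {
    SKETCH_MODE_NOPRECEDE = 1 << 0,
    SKETCH_MODE_NOSUCCEED = 1 << 1
};

typedef struct {
    bool shm_window;         /* window came from MPI_Win_allocate_shared */
    bool fence_seen;         /* false until the first fence has run */
    int  last_fence_assert;  /* assertion bits passed to the previous fence */
} sketch_fence_state_t;

/* Decide whether this fence needs a memory barrier, then record the
 * fence so the next call can see what preceded it. */
bool sketch_fence_needs_barrier(sketch_fence_state_t *st, int assert_bits)
{
    assert(st != NULL);

    bool need = st->shm_window
        && !(assert_bits & SKETCH_MODE_NOPRECEDE)              /* (1) */
        && st->fence_seen                                      /* (2) */
        && !(st->last_fence_assert & SKETCH_MODE_NOSUCCEED);   /* (3) */

    st->fence_seen = true;
    st->last_fence_assert = assert_bits;
    return need;
}
```

Each condition rules out a case where no load/store to the shared window can have happened since the last synchronization, so the barrier would order nothing.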
-
- 08 Aug, 2013 1 commit
-
-
Initialize "list_complete" before entering MPIDI_CH3I_RMAListPartialComplete because it is used in that function. Fixes ticket #1906. Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
- 01 Aug, 2013 1 commit
-
-
When judging whether the origin and target processes are on the same node, use the vc->node_id flag instead of the vc->ch.is_local flag. The 'is_local' flag is not correct because it is defined in nemesis, not in CH3. The 'node_id' flag is defined in CH3. Note that for ch3:sock, even if origin and target are on the same node, they are not within the same SHM region. Currently ch3:sock is filtered out by checking the shm_allocated flag first. In the future we need to figure out a way to check whether origin and target are within the same "SHM comm". Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
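The check order described in this commit might look like the sketch below. The struct layouts are hypothetical and reduced to the two fields the message mentions (`node_id` and `shm_allocated`); the function name is an assumption as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced virtual-connection record. */
typedef struct {
    int node_id;   /* node identifier available at the CH3 level */
} sketch_vc_t;

/* Hypothetical, reduced window record. */
typedef struct {
    bool shm_allocated;  /* window memory lives in a shared segment */
} sketch_shm_win_t;

/* True only if the window is SHM-backed AND both ends share a node. */
bool sketch_target_is_shm_local(const sketch_shm_win_t *win,
                                const sketch_vc_t *origin,
                                const sketch_vc_t *target)
{
    assert(win != NULL && origin != NULL && target != NULL);

    /* Checked first: a channel without a shared segment never takes
     * the SHM path, even when both ranks are on one node. */
    if (!win->shm_allocated)
        return false;

    return origin->node_id == target->node_id;
}
```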
-
- 28 Jul, 2013 3 commits
-
-
If "alloc_shm" is set, it may happen that the target process is performing an RMA operation on behalf of a remote process while a local process is concurrently performing an RMA operation on the same target and on an overlapping memory location. Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
Delete the ref count decrement in SHM RMA operations, and add conditions in the operation issue routines instead. In the RMA operation issue routines, check whether shm_allocate == 1 and the target vc is local; if so, do not add a reference count on the datatypes, because they will not be referenced by the progress engine but will be completed directly by the origin. Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
Decrement datatype reference counts in SHM RMA operations, because the datatypes will not be referenced by the progress engine but will be completed directly by the origin. Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
- 18 Jul, 2013 2 commits
-
-
Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
- 17 May, 2013 1 commit
-
-
Pavan Balaji authored
All CH3 parameters start with CH3_ now. All nemesis parameters start with NEMESIS_. For netmod specific parameters, we use NEMESIS_<netmod>_. Reviewed by Charles Archer @ IBM.
-
- 14 May, 2013 1 commit
-
-
Pavan Balaji authored
No reviewer.
-
- 07 May, 2013 1 commit
-
-
James Dinan authored
Myrank was caching win_ptr->comm_ptr->rank, so we now use that directly rather than caching it in the MPID_Win object. Reviewer: balaji
-
- 06 May, 2013 1 commit
-
-
James Dinan authored
Reviewer: None
-
- 21 Feb, 2013 10 commits
-
-
James Dinan authored
I initially added a conservative flush message for empty epochs (mostly for documentation purposes). This is not needed in the current implementation, since ops are not issued eagerly. If/when eager ops are implemented, this patch should be reverted and additional window state tracking for this case should be added. In the meantime, I am removing this code to improve performance. Reviewer: goodell
-
James Dinan authored
This patch adds a few missing memory fences to the window synchronization operations for shared memory windows. This closes ticket #1729. Reviewer: goodell
-
James Dinan authored
Moved RMA errors used in the CH3 RMA implementation into the ch3 errnames.txt file. Reviewer: goodell
-
James Dinan authored
When the MPI_MODE_NOCHECK assertion is given to a passive target lock operation, we defer acquisition of the lock and piggyback the request on the first RMA op to the target. This eliminates a round-trip lock-request message. Reviewer: goodell
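A rough sketch of the deferral, counting only origin-to-target messages. Everything here is hypothetical (`sketch_target_t`, `SKETCH_PKT_FLAG_LOCK`, the message counter); the lock reply and the actual packet layout are not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bit carried in an RMA operation packet. */
enum { SKETCH_PKT_FLAG_LOCK = 1 << 0 };

typedef struct {
    bool lock_deferred;   /* lock request not yet sent to the target */
    int  messages_sent;   /* origin-to-target messages issued so far */
} sketch_target_t;

/* With nocheck set, send nothing now and remember that a lock request
 * is still owed; otherwise send an explicit lock request. */
void sketch_lock(sketch_target_t *t, bool nocheck)
{
    assert(t != NULL);
    if (nocheck) {
        t->lock_deferred = true;
        return;
    }
    t->messages_sent++;
    t->lock_deferred = false;
}

/* Issue one RMA op; returns the flags that travel with its packet. */
int sketch_issue_op(sketch_target_t *t)
{
    assert(t != NULL);
    int flags = 0;
    if (t->lock_deferred) {
        flags |= SKETCH_PKT_FLAG_LOCK;  /* lock rides on the first op */
        t->lock_deferred = false;
    }
    t->messages_sent++;
    return flags;
}
```

In this model a lock followed by one operation costs one message with the assertion and two without it, which is the saving the commit describes.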
-
James Dinan authored
This patch adds piggybacking of flush synchronization on top of the last operation in an RMA epoch. Reviewer: goodell
-
James Dinan authored
The single_op_opt flag in the request object was previously used to track whether an operation is a lock-op-unlock type, for the purposes of completion. Tracking this state has been merged into the packet header flags, so the single_op_opt flag is no longer needed. Reviewer: goodell
-
James Dinan authored
Removed source_win_handle from the packet header, since it's no longer needed. Reviewer: goodell
-
James Dinan authored
This patch uses packet header flags to piggyback the unlock operation on other RMA operations. For most operations, there is no net change. However, for FOP and GACC, unlock piggybacking was previously not implemented. Reviewer: goodell
-
James Dinan authored
This change extends RMA op processing to pass around flags as needed. It doesn't yet utilize the flags. Reviewer: goodell
-
James Dinan authored
Added flags field to RMA operation packets that are sent from origin to target. This will be used to piggyback RMA synchronization operations. Reviewer: goodell
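One way such a flags field could be laid out is sketched below. The bit names, the `uint16_t` width, and `sketch_mark_epoch` are assumptions made for illustration and are not taken from the MPICH packet definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical synchronization bits carried in the packet header. */
enum {
    SKETCH_FLAG_NONE   = 0,
    SKETCH_FLAG_LOCK   = 1 << 0,
    SKETCH_FLAG_UNLOCK = 1 << 1,
    SKETCH_FLAG_FLUSH  = 1 << 2
};

/* Hypothetical origin-to-target RMA packet, reduced to two fields. */
typedef struct {
    int      type;    /* operation kind (put, get, accumulate, ...) */
    uint16_t flags;   /* piggybacked synchronization requests */
} sketch_rma_pkt_t;

/* Tag the packets of one epoch: the first carries the lock request,
 * the last carries the unlock. A one-packet epoch carries both. */
void sketch_mark_epoch(sketch_rma_pkt_t *pkts, int n)
{
    assert(pkts != NULL && n > 0);
    for (int i = 0; i < n; i++)
        pkts[i].flags = SKETCH_FLAG_NONE;
    pkts[0].flags |= SKETCH_FLAG_LOCK;
    pkts[n - 1].flags |= SKETCH_FLAG_UNLOCK;
}
```

Using independent bits lets several synchronization requests share one packet, so the target can test each request separately.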
-