1. 26 Jun, 2015 1 commit
  2. 24 Jun, 2015 1 commit
    • Xin Zhao's avatar
      Bug-fix: correct local/remote completion in Win_flush_local. · 45dfd815
      Xin Zhao authored
      
      
      The original implementation in Win_flush_local counts number
      of total local completion and remote completion needed to
      wait, and then waiting for current local/remote completion
      count to reach those values. There is a bug that we should
      initialize the current count to zero in each while loop,
      otherwise the targets that are already completed will be
      count again and we failed to wait for some targets to be
      completed. This patch fixes this issue.
      Signed-off-by: Pavan Balaji's avatarPavan Balaji <balaji@anl.gov>
      45dfd815
  3. 14 Jun, 2015 1 commit
  4. 12 Jun, 2015 6 commits
  5. 03 Mar, 2015 1 commit
  6. 16 Dec, 2014 12 commits
    • Xin Zhao's avatar
      Simplify epoch checking in Win_lock · 52531f77
      Xin Zhao authored
      when lock_epoch_count != 0, we only need to check if access_state
      is PER_TARGET in Win_lock.
      
      No reviewer.
      52531f77
    • Xin Zhao's avatar
      Set window epoch state at proper places in RMA calls. · 89d8f6c1
      Xin Zhao authored
      (1) Win_fence/Win_start: set access state right after we
          issue synchronization calls.
      (2) Win_post: set exposure state at beginning.
      (3) Win_wait/Win_test: set exposure state at end.
      (4) Win_lock/Win_lock_all: set access state at beginning.
      (5) Win_unlock/Win_unlock_all: set access state at end.
      
      No reviewer.
      89d8f6c1
    • Xin Zhao's avatar
      Reset/release window attributes in RMA sync calls. · ff6e5f9b
      Xin Zhao authored
      In Win_complete, release all requests on window; in
      Win_unlock_all, reset lock_assert on window.
      
      No reviewer.
      ff6e5f9b
    • Xin Zhao's avatar
      Bug-fix: always allocate ranks array in Win_start. · 264be641
      Xin Zhao authored
      We always need to allocate a array to store group ranks
      even for MPI_MODE_NOCHECK case, because we need always
      need that in Win_complete.
      
      No reviewer.
      264be641
    • Xin Zhao's avatar
      Check if all targets are freed at end of RMA sync calls. · 6b56d44a
      Xin Zhao authored
      For Win_fence, Win_complete and Win_unlock_all, check if
      all targets are freed at the end of function calls.
      
      No reviewer.
      6b56d44a
    • Xin Zhao's avatar
      Do memory barriers at proper places in RMA sync calls. · 6f8c3e59
      Xin Zhao authored
      We call memory barriers at proper places in RMA sync calls
      as following, and remove unnecessary memory barriers:
      
      (1) Win_fence: very beginning and very end.
      (2) Win_post/Win_complete: very beginning.
      (3) Win_start/Win_wait/Win_test: very end.
      (4) Win_lock/Win_lock_all: very end.
      (5) Win_unlock/Win_unlock_all: very beginning.
      (6) Win_flush/Win_flush_local/Win_flush_all/Win_flush_local_all: very beginning.
      
      About the reason of doing this, please refer to comments
      at the beginning of src/mpid/ch3/src/ch3u_rma_sync.c.
      
      No reviewer.
      6f8c3e59
    • Xin Zhao's avatar
      Poke progress engine in RMA sync call when needed · fb6a441b
      Xin Zhao authored
      In ending RMA synchronization calls, we poke the
      progress engine at last if we never poke it before.
      Because some program execution depends on the
      incoming events in progress engine, if we never
      process them we may cause deadlock in the program.
      
      No reviewer.
      fb6a441b
    • Xin Zhao's avatar
      Bug-fix: modify free_ops_before_completion function · 04d15190
      Xin Zhao authored
      Originally free_ops_before_completion functions only
      works with active target. Here we modify it to accomodate
      passive target as well.
      
      Also, everytime we trigger free_ops_before_completion,
      we lose the chance to do real Win_flush_local operation
      and must do a Win_flush instead. Here we transfer
      Win_flush_local to Win_flush if disable_flush_local flag
      is set, and unset that flag after the current flush
      is fone.
      
      No reviewer.
      04d15190
    • Xin Zhao's avatar
      Bug-fix: handle dest==MPI_PROC_NULL in Win_flush/flush_local · e12376fd
      Xin Zhao authored
      No reviewer.
      e12376fd
    • Xin Zhao's avatar
      Bug-fix: check win_ptr->active_req_cnt in RMA sync calls · e92b7746
      Xin Zhao authored
      No reviewer.
      e92b7746
    • Xin Zhao's avatar
      Bug-fix: correctly modify win_ptr->accumulated_ops_cnt · 7b1a5e2d
      Xin Zhao authored
      accumulated_ops_cnt is used to track no. of accumulated
      posted RMA operations between two synchronization calls,
      so that we can decide when to poke progress engine based
      on the current value of this counter.
      
      Here we initialize it to zero in the BEGINNING synchronization
      calls (Win_fence, Win_start, first Win_lock, Win_lock_all),
      and correctly decrement it in the ENDING synchronization calls
      (Win_fence, Win_complete, Win_unlock, Win_unlock_all,
      Win_flush, Win_flush_local, Win_flush_all, Win_flush_local_all).
      We also use a per-target counter to track single target case.
      
      No reviewer.
      7b1a5e2d
    • Xin Zhao's avatar
      Code-refactor: arrange RMA sync functions. · a544067b
      Xin Zhao authored
      Arrange RMA sync functions in src/mpid/ch3/src/ch3u_rma_sync.c
      in the following order:
      
      Win_fence
      Win_post
      Win_start
      Win_complete
      Win_wait
      Win_test
      Win_lock
      Win_unlock
      Win_flush
      Win_flush_local
      Win_lock_all
      Win_unlock_all
      Win_flush_all
      Win_flush_local_all
      Win_sync
      
      No reviewer.
      a544067b
  7. 13 Nov, 2014 1 commit
    • Xin Zhao's avatar
      Perf-tuning: issue FLUSH, FLUSH ACK, UNLOCK ACK messages only when needed. · a9d968cc
      Xin Zhao authored
      
      
      When operation pending list and request lists are all empty, FLUSH message
      needs to be sent by origin only when origin issued PUT/ACC operations since
      the last synchronization calls, otherwise origin does not need to issue FLUSH
      at all and does not need to wait for FLUSH ACK message.
      
      Similiarly, origin waits for ACK of UNLOCK message only when origin issued
      PUT/ACC operations since the last synchronization calls. However, UNLOCK
      message always needs to be sent out because origin needs to unlock the
      target process. This patch avoids issuing unnecessary
      FLUSH / FLUSH ACK / UNLOCK ACK messages.
      Signed-off-by: Pavan Balaji's avatarPavan Balaji <balaji@anl.gov>
      a9d968cc
  8. 12 Nov, 2014 1 commit
    • Wesley Bland's avatar
      Change errflag to be an enum · 3850e6bf
      Wesley Bland authored
      
      
      The errflag value being used in the MPIC helper functions only
      propagated whether or not an error occurred. It did not contain any
      information about what kind of error occurred, which made returning the
      correct error code after a process failure impossible.
      
      This patch converts the binary value to an enum with three options:
      MPIR_ERR_NONE
      MPIR_ERR_PROC_FAILED
      MPIR_ERR_OTHER
      
      The original use of TRUE and false maps to MPIR_ERR_NONE and
      MPIR_ERR_OTHER.
      
      MPIR_ERR_PROC_FAILED indicates that the error occurred
      because of a process failure. It uses the new bit set aside from the tag
      space to track such information between processes.
      
      This change required modifying lots of function signatures and type
      declarations to use the new enum type, but these are actually not very
      intrusive changes and shouldn't be a problem going forward.
      Signed-off-by: default avatarHuiwei Lu <huiweilu@mcs.anl.gov>
      3850e6bf
  9. 11 Nov, 2014 2 commits
  10. 03 Nov, 2014 9 commits
    • Xin Zhao's avatar
      add original RMA PVARs back. · ed20cd37
      Xin Zhao authored
      
      
      Add some original RMA PVARs back to the new
      RMA infrastructure, including timing of packet
      handlers, op allocation and setting, window
      creation, etc.
      Signed-off-by: Pavan Balaji's avatarPavan Balaji <balaji@anl.gov>
      ed20cd37
    • Xin Zhao's avatar
      Delete no longer needed code. · cc63b367
      Xin Zhao authored
      
      
      We made a huge change to RMA infrastructure and
      a lot of old code can be droped, including separate
      handlers for lock-op-unlock, ACCUM_IMMED specific
      code, O(p) data structure code, code of lazy issuing,
      etc.
      Signed-off-by: Pavan Balaji's avatarPavan Balaji <balaji@anl.gov>
      cc63b367
    • Xin Zhao's avatar
      Rewrite all synchronization routines. · 38b20e57
      Xin Zhao authored
      
      
      We use new algorithms for RMA synchronization
      functions and RMA epochs. The old implementation
      uses a lazy-issuing algorithm, which queues up
      all operations and issues them at end. This
      forbid opportunites to do hardware RMA operations
      and can use up all memory resources when we
      queue up large number of operations.
      
      Here we use a new algorithm, which will initialize
      the synchonization at beginning, and issue operations
      as soon as the synchronization is finished.
      Signed-off-by: Pavan Balaji's avatarPavan Balaji <balaji@anl.gov>
      38b20e57
    • Xin Zhao's avatar
      Add new RMA states on window / target and modify state checking. · f076f3fe
      Xin Zhao authored
      
      
      We define new states to indicate the current situation of
      RMA synchronization. The states contain both ACCESS states
      and EXPOPSURE states, and specify if the synchronization
      is initialized (_CALLED), on-going (_ISSUED) and completed
      (_GRANTED). For single lock in Passive Target, we use
      per-target state whereas the window state is set to PER_TARGET.
      Signed-off-by: Pavan Balaji's avatarPavan Balaji <balaji@anl.gov>
      f076f3fe
    • Xin Zhao's avatar
      Add global / local pools of RMA ops and related APIs. · fc7617f2
      Xin Zhao authored
      
      
      Instead of allocating / deallocating RMA operations whenever
      an RMA op is posted by user, we allocate fixed size operation
      pools beforehand and take the op element from those pools
      when an RMA op is posted.
      
      With only a local (per-window) op pool, the number of ops
      allocated can increase arbitrarily if many windows are created.
      Alternatively, if we only use a global op pool, other windows
      might use up all operations thus starving the window we are
      working on.
      
      In this patch we create two pools: a local (per-window) pool and a
      global pool.  Every window is guaranteed to have at least the number
      of operations in the local pool.  If we run out of these operations,
      we check in the global pool to see if we have any operations left.
      When an operation is released, it is added back to the same pool it
      was allocated from.
      Signed-off-by: Pavan Balaji's avatarPavan Balaji <balaji@anl.gov>
      fc7617f2
    • Xin Zhao's avatar
      Embedding packet structure into RMA operation structure. · b1685139
      Xin Zhao authored
      
      
      We were duplicating information in the operation structure and in the
      packet structure when the message is actually issued.  Since most of
      the information is the same anyway, this patch just embeds a packet
      structure into the operation structure, so that we eliminate unnessary
      copy.
      Signed-off-by: Pavan Balaji's avatarPavan Balaji <balaji@anl.gov>
      b1685139
    • Xin Zhao's avatar
      Avoid using VC in RMA lock queue structure. · 0eaf344b
      Xin Zhao authored
      
      
      We were adding an unnecessary dependency on VC structure
      declarations in the mpidpkt.h file. The required information
      in RMA lock queue is only the rank, but not actual VC.
      Here we replace VC with rank.
      Signed-off-by: Pavan Balaji's avatarPavan Balaji <balaji@anl.gov>
      0eaf344b
    • Xin Zhao's avatar
      Code refactoring to clean up the RMA code. · 61f952c7
      Xin Zhao authored
      
      
      Split RMA functionality into smaller files, and move functions
      to where they belong based on the file names.
      Signed-off-by: Pavan Balaji's avatarPavan Balaji <balaji@anl.gov>
      61f952c7
    • Xin Zhao's avatar
      Temporarily remove all RMA PVARs. · 5c513032
      Xin Zhao authored
      
      
      Because we are going to rewrite the RMA infrastructure
      and many PVARs will no longer be used, here we temporarily
      remove all PVARs and will add needed PVARs back after new
      implementation is done.
      Signed-off-by: Pavan Balaji's avatarPavan Balaji <balaji@anl.gov>
      5c513032
  11. 01 Nov, 2014 1 commit
    • Xin Zhao's avatar
      Bug-fix: always waiting for remote completion in Win_unlock. · c76aa786
      Xin Zhao authored
      
      
      The original implementation includes an optimization which
      allows Win_unlock for exclusive lock to return without
      waiting for remote completion. This relys on the
      assumption that window memory on target process will not
      be accessed by a third party until that target process
      finishes all RMA operations and grants the lock to other
      processes. However, this assumption is not correct if user
      uses assert MPI_MODE_NOCHECK. Consider the following code:
      
                P0                              P1           P2
          MPI_Win_lock(P1, NULL, exclusive);
          MPI_Put(X);
          MPI_Win_unlock(P1, exclusive);
          MPI_Send (P2);                                MPI_Recv(P0);
                                                        MPI_Win_lock(P1, MODE_NOCHECK, exclusive);
                                                        MPI_Get(X);
                                                        MPI_Win_unlock(P1, exclusive);
      
      Both P0 and P2 issue exclusive lock to P1, and P2 uses assert
      MPI_MODE_NOCHECK because the lock should be granted to P2 after
      synchronization between P2 and P0. However, in the original
      implementation, GET operation on P2 might not get the updated
      value since Win_unlock on P0 return without waiting for remote
      completion.
      
      In this patch we delete this optimization. In Win_free, since every
      Win_unlock guarantees the remote completion, target process no
      longer needs to do additional counting works to detect target-side
      completion, but only needs to do a global barrier.
      Signed-off-by: Pavan Balaji's avatarPavan Balaji <balaji@anl.gov>
      c76aa786
  12. 30 Oct, 2014 2 commits
    • Xin Zhao's avatar
      Clean up white-space and code format in RMA code. · fe283e91
      Xin Zhao authored
      No reviewer.
      fe283e91
    • Min Si's avatar
      Bug-fix: trigger final req handler for receiving derived datatype. · 920661c3
      Min Si authored
      
      
      There are two request handlers used when receiving data:
      (1) OnDataAvail, which is triggered when data is arrived;
      (2) OnFinal, which is triggered when receiving data is finished;
      
      When receiving large derived datatype, the receiving iov can be divided
      into multiple iovs. The OnDataAvail handler is set to iov load function
      when still waiting for remaining data. However, such handler should be
      set to OnFinal when starting receiving the last iov.
      
      The original code does not set OnDataAvail handler to OnFinal at end.
      This patch fixes this bug.
      
      Note that this bug only appears in RMA calls, because only the RMA
      packet handers need to specify OnFinal.
      
      Resolve #2189.
      Signed-off-by: default avatarXin Zhao <xinzhao3@illinois.edu>
      920661c3
  13. 01 Oct, 2014 2 commits