1. 03 Nov, 2014 12 commits
    • Split shared RMA packet structures. · c0094faa
      Xin Zhao authored
      
      
      Previously, several RMA packet types shared the same structure,
      which made the code misleading. This patch gives each
      RMA packet type its own packet data structure.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
    • Control no. of active RMA requests in the runtime. · 257faca2
      Xin Zhao authored
      
      
      When there are too many active requests in the runtime,
      internal memory might be exhausted. This patch
      prevents that situation by triggering a blocking
      wait loop in operation routines when the number of active
      requests reaches a threshold value.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
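The throttling described in this commit could be sketched roughly as follows. This is a minimal illustration, not the MPICH code: `rma_runtime_t`, `MAX_ACTIVE_REQS`, `progress_wait`, and `issue_op` are hypothetical names, the threshold value is invented, and the progress engine is simulated by completing one request per call.

```c
#include <assert.h>

/* Hypothetical threshold; the commit does not state the real value. */
#define MAX_ACTIVE_REQS 4

typedef struct {
    int active_reqs;   /* requests issued but not yet completed */
    int wait_calls;    /* number of blocking waits, kept for illustration */
} rma_runtime_t;

/* Stand-in for the progress engine: pretend one request completes per call. */
static void progress_wait(rma_runtime_t *rt)
{
    assert(rt->active_reqs > 0);
    rt->active_reqs--;
    rt->wait_calls++;
}

/* Issue one operation, blocking first while the runtime is saturated. */
static void issue_op(rma_runtime_t *rt)
{
    while (rt->active_reqs >= MAX_ACTIVE_REQS)
        progress_wait(rt);
    rt->active_reqs++;
}
```

With this shape, the number of in-flight requests never exceeds the threshold, at the cost of the operation routine occasionally blocking.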
    • Enable making progress in operation routines. · 33d96690
      Xin Zhao authored
      
      
      We no longer use the lazy-issuing model, which delays
      issuing all operations until the end; instead, we issue them
      as early as possible. To achieve this, we enable
      making progress in RMA routines, so that RMA operations
      can be issued as soon as synchronization is finished.
      
      Sometimes we also need to poke the progress engine in
      operation routines to make sure that the target side
      makes enough progress on receiving packets. Here
      we trigger it when the number of posted operations reaches
      a threshold value.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
    • Implement GET_OP routine which guarantees to return an OP. · 5dd55154
      Xin Zhao authored
      
      
      The GET_OP function may block, but it is guaranteed
      to return an RMA operation.
      
      Inside GET_OP we first call the normal OP_ALLOC function,
      which tries to get a new OP from the OP pools. If that fails,
      we call the nonblocking GC function to clean up completed ops
      and then call OP_ALLOC again. If we still cannot get a
      new OP, we call the nonblocking FREE_OP_BEFORE_COMPLETION
      function (if hardware ordering is provided) and then call
      OP_ALLOC again. If that also fails, we finally call the blocking
      aggressive cleanup function, which is guaranteed to
      return a new OP element.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
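The fallback chain in GET_OP can be illustrated with the sketch below. It only models the control flow from the commit message: the pool is reduced to a few counters, every helper is a stand-in with a hypothetical lowercase name, and `last_stage` exists purely so the path taken can be observed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { int unused; } op_t;

/* Toy model of the op pools: counters instead of real lists. */
typedef struct {
    int  free_ops;       /* elements available right now */
    int  completed_ops;  /* finished ops not yet reclaimed */
    int  issued_ops;     /* issued ops that have not completed */
    bool hw_ordering;    /* whether the network provides ordering */
    int  last_stage;     /* which stage produced the op (illustration only) */
} op_pool_t;

static op_t dummy_op;    /* every allocation returns this placeholder */

static op_t *op_alloc(op_pool_t *p)
{
    if (p->free_ops == 0)
        return NULL;
    p->free_ops--;
    return &dummy_op;
}

/* Nonblocking: reclaim ops that have already completed. */
static void gc_nonblocking(op_pool_t *p)
{
    p->free_ops += p->completed_ops;
    p->completed_ops = 0;
}

/* Nonblocking: release issued ops early (only valid with hardware ordering). */
static void free_ops_before_completion(op_pool_t *p)
{
    p->free_ops += p->issued_ops;
    p->issued_ops = 0;
}

/* Blocking in the real runtime; here it simply yields one element. */
static void cleanup_aggressive(op_pool_t *p)
{
    p->free_ops += 1;
}

static op_t *get_op(op_pool_t *p)
{
    op_t *op;

    p->last_stage = 1;
    if ((op = op_alloc(p)) != NULL)
        return op;

    p->last_stage = 2;
    gc_nonblocking(p);
    if ((op = op_alloc(p)) != NULL)
        return op;

    if (p->hw_ordering) {
        p->last_stage = 3;
        free_ops_before_completion(p);
        if ((op = op_alloc(p)) != NULL)
            return op;
    }

    p->last_stage = 4;
    cleanup_aggressive(p);
    op = op_alloc(p);
    assert(op != NULL);   /* the last stage must always succeed */
    return op;
}
```

The cheap, nonblocking stages are tried first, so the blocking cleanup is reached only when nothing else frees an element.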
    • Add blocking aggressive cleanup functions for ops / targets. · 41a365ec
      Xin Zhao authored
      
      
      When we run out of resources for operations and targets,
      we need the runtime to complete some operations
      so that it can free some resources.
      
      For RMA operations, we do an internal
      FLUSH_LOCAL for one target and wait for operation
      resources; for RMA targets, we do an
      internal FLUSH for one target and wait for
      target resources.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
    • Add new RMA states on window / target and modify state checking. · f076f3fe
      Xin Zhao authored
      
      
      We define new states to indicate the current status of
      RMA synchronization. The states include both ACCESS states
      and EXPOSURE states, and specify whether the synchronization
      is initialized (_CALLED), ongoing (_ISSUED), or completed
      (_GRANTED). For a single lock in passive target mode, we use the
      per-target state, and the window state is set to PER_TARGET.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
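As a rough illustration of the state scheme, the sketch below models only the access-side lock states and the PER_TARGET indirection. The enum members, struct names, and `can_issue_ops` are hypothetical; the real MPICH state names and checks are not given in the commit message, and exposure states are omitted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical access states following the _CALLED / _ISSUED / _GRANTED
 * suffix convention described in the commit message. */
typedef enum {
    STATE_NONE,
    STATE_LOCK_CALLED,    /* synchronization requested by the user */
    STATE_LOCK_ISSUED,    /* request sent, waiting for the grant */
    STATE_LOCK_GRANTED,   /* synchronization complete, ops may go out */
    STATE_PER_TARGET      /* window defers to the per-target state */
} sync_state_t;

typedef struct { sync_state_t state; } target_t;
typedef struct { sync_state_t state; } win_t;

/* Ops may be issued only once the relevant synchronization is granted. */
static bool can_issue_ops(const win_t *win, const target_t *target)
{
    if (win->state == STATE_PER_TARGET)
        return target->state == STATE_LOCK_GRANTED;
    return win->state == STATE_LOCK_GRANTED;
}
```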
    • Add a flag in op struct to indicate derived datatype. · 7eac974f
      Xin Zhao authored
      
      
      Add a flag is_dt to the op structure, which is set when any
      buffer involved in the RMA operation contains derived
      datatype data. This makes it convenient to enqueue
      issued but not yet completed operations to the DT-specific
      list.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
    • Add routine to enqueue op to RMA slots. · 079a516b
      Xin Zhao authored
      
      
      Given an RMA op, find the correct slot and target, and
      enqueue the op to the pending op list in that target object.
      If the target does not exist, create one in that slot.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
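A minimal sketch of the enqueue routine is shown below. All type and function names are hypothetical, and choosing the slot as `rank % NUM_SLOTS` is an assumption for illustration; the commit message does not say how slots are selected. Ranks are assumed to be non-negative.

```c
#include <assert.h>
#include <stdlib.h>

#define NUM_SLOTS 4   /* hypothetical slot count */

typedef struct op {
    struct op *next;
    int target_rank;
} op_t;

typedef struct target {
    struct target *next;       /* next target in the same slot */
    int rank;
    op_t *pending_head;        /* FIFO list of pending ops */
    op_t *pending_tail;
} target_t;

typedef struct {
    target_t *slots[NUM_SLOTS];
} win_t;

/* Look up the target for a rank in its slot, creating it if missing. */
static target_t *find_or_create_target(win_t *win, int rank)
{
    target_t **slot = &win->slots[rank % NUM_SLOTS];
    for (target_t *t = *slot; t != NULL; t = t->next)
        if (t->rank == rank)
            return t;

    target_t *t = calloc(1, sizeof *t);
    if (t == NULL)
        return NULL;
    t->rank = rank;
    t->next = *slot;           /* push the new target onto the slot list */
    *slot = t;
    return t;
}

/* Append the op to the pending list of its target. Returns 0 on success. */
static int enqueue_op(win_t *win, op_t *op)
{
    target_t *t = find_or_create_target(win, op->target_rank);
    if (t == NULL)
        return -1;
    op->next = NULL;
    if (t->pending_tail != NULL)
        t->pending_tail->next = op;
    else
        t->pending_head = op;
    t->pending_tail = op;
    return 0;
}
```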
    • Add global / local pools of RMA ops and related APIs. · fc7617f2
      Xin Zhao authored
      
      
      Instead of allocating / deallocating RMA operations whenever
      an RMA op is posted by the user, we allocate fixed-size operation
      pools beforehand and take the op element from those pools
      when an RMA op is posted.
      
      With only a local (per-window) op pool, the number of ops
      allocated can increase arbitrarily if many windows are created.
      Alternatively, if we only use a global op pool, other windows
      might use up all operations thus starving the window we are
      working on.
      
      In this patch we create two pools: a local (per-window) pool and a
      global pool.  Every window is guaranteed to have at least the number
      of operations in the local pool.  If we run out of these operations,
      we check in the global pool to see if we have any operations left.
      When an operation is released, it is added back to the same pool it
      was allocated from.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
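The two-level pool scheme might look something like the sketch below. The pool sizes, type names, and the `origin` tag used to return an element to the pool it came from are assumptions made for illustration, not the actual MPICH data structures.

```c
#include <assert.h>
#include <stddef.h>

#define LOCAL_POOL_SIZE  2   /* hypothetical per-window guarantee */
#define GLOBAL_POOL_SIZE 3   /* hypothetical shared overflow capacity */

typedef enum { POOL_LOCAL, POOL_GLOBAL } pool_kind_t;

typedef struct op {
    struct op *next;
    pool_kind_t origin;      /* remembers which pool owns this element */
} op_t;

typedef struct { op_t *free_list; } pool_t;

typedef struct {
    pool_t local;
    op_t local_ops[LOCAL_POOL_SIZE];
} win_t;

static pool_t global_pool;
static op_t global_ops[GLOBAL_POOL_SIZE];

/* Thread a fixed array of elements onto the pool's free list. */
static void pool_init(pool_t *p, op_t *ops, int n, pool_kind_t kind)
{
    p->free_list = NULL;
    for (int i = 0; i < n; i++) {
        ops[i].origin = kind;
        ops[i].next = p->free_list;
        p->free_list = &ops[i];
    }
}

static op_t *pool_pop(pool_t *p)
{
    op_t *op = p->free_list;
    if (op != NULL)
        p->free_list = op->next;
    return op;
}

/* Try the window's own pool first, then fall back to the global pool. */
static op_t *op_alloc(win_t *win)
{
    op_t *op = pool_pop(&win->local);
    if (op == NULL)
        op = pool_pop(&global_pool);
    return op;               /* NULL if both pools are exhausted */
}

/* Return the element to the pool it was allocated from. */
static void op_free(win_t *win, op_t *op)
{
    pool_t *p = (op->origin == POOL_LOCAL) ? &win->local : &global_pool;
    op->next = p->free_list;
    p->free_list = op;
}
```

Tagging each element with its origin keeps the per-window guarantee intact: a local element can never leak into the global pool, and vice versa.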
    • Embedding packet structure into RMA operation structure. · b1685139
      Xin Zhao authored
      
      
      We were duplicating information in the operation structure and in the
      packet structure when the message is actually issued.  Since most of
      the information is the same anyway, this patch just embeds a packet
      structure into the operation structure, so that we eliminate an
      unnecessary copy.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
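The idea of embedding the packet in the operation can be sketched as follows. The field names and the `put_pkt_t` / `op_t` layouts are hypothetical and much simpler than the real structures; the point is only that the packet is filled once when the op is posted and sent as-is at issue time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wire packet for a PUT. */
typedef struct {
    int    type;
    int    target_rank;
    size_t target_offset;
    int    count;
} put_pkt_t;

enum { PKT_PUT = 1 };

/* The operation embeds the packet instead of duplicating its fields. */
typedef struct op {
    struct op  *next;
    const void *origin_addr;
    put_pkt_t   pkt;         /* filled at post time, sent directly later */
} op_t;

/* Posting writes the packet fields in place. */
static void post_put(op_t *op, const void *origin, int target,
                     size_t offset, int count)
{
    op->next = NULL;
    op->origin_addr = origin;
    op->pkt.type = PKT_PUT;
    op->pkt.target_rank = target;
    op->pkt.target_offset = offset;
    op->pkt.count = count;
}

/* Issuing needs no copy: the embedded packet is the one that goes out. */
static const put_pkt_t *packet_to_send(const op_t *op)
{
    return &op->pkt;
}
```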
    • Code refactoring to clean up the RMA code. · 61f952c7
      Xin Zhao authored
      
      
      Split RMA functionality into smaller files, and move functions
      to where they belong based on the file names.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
    • Temporarily remove all RMA PVARs. · 5c513032
      Xin Zhao authored
      
      
      Because we are going to rewrite the RMA infrastructure
      and many PVARs will no longer be used, we temporarily
      remove all PVARs here and will add the needed PVARs back after
      the new implementation is done.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
  2. 01 Nov, 2014 1 commit
    • Bug-fix: always waiting for remote completion in Win_unlock. · c76aa786
      Xin Zhao authored
      
      
      The original implementation includes an optimization which
      allows Win_unlock for an exclusive lock to return without
      waiting for remote completion. This relies on the
      assumption that window memory on the target process will not
      be accessed by a third party until that target process
      finishes all RMA operations and grants the lock to other
      processes. However, this assumption is not correct if the user
      uses the assert MPI_MODE_NOCHECK. Consider the following code:
      
                P0                              P1           P2
          MPI_Win_lock(P1, NULL, exclusive);
          MPI_Put(X);
          MPI_Win_unlock(P1, exclusive);
          MPI_Send (P2);                                MPI_Recv(P0);
                                                        MPI_Win_lock(P1, MODE_NOCHECK, exclusive);
                                                        MPI_Get(X);
                                                        MPI_Win_unlock(P1, exclusive);
      
      Both P0 and P2 issue an exclusive lock to P1, and P2 uses the assert
      MPI_MODE_NOCHECK because the lock should be granted to P2 after
      synchronization between P2 and P0. However, in the original
      implementation, the GET operation on P2 might not get the updated
      value since Win_unlock on P0 returns without waiting for remote
      completion.
      
      In this patch we delete this optimization. In Win_free, since every
      Win_unlock guarantees remote completion, the target process no
      longer needs to do additional counting work to detect target-side
      completion, but only needs to do a global barrier.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
  3. 30 Oct, 2014 1 commit
  4. 18 Jul, 2014 1 commit
  5. 17 Jul, 2014 1 commit
    • Simplified RMA_Op structure. · 274a5a70
      Pavan Balaji authored
      
      
      We were duplicating information in the operation structure
      and in the packet structure when the message is actually issued.
      Since most of the information is the same anyway, this patch just
      embeds a packet structure into the operation structure.
      Signed-off-by: Xin Zhao <xinzhao3@illinois.edu>
  6. 11 Apr, 2014 2 commits
  7. 30 Dec, 2013 1 commit
    • Fix warnings in ch3u_rma_acc_ops and ch3u_rma_ops · 583e3f0a
      Antonio J. Pena authored
      
      
      Fixes the following warnings (with --enable-strict):
      
      src/mpid/ch3/src/ch3u_rma_acc_ops.c: In function 'MPIDI_Get_accumulate':
      src/mpid/ch3/src/ch3u_rma_acc_ops.c:31:5: warning: unused variable
      'mpiu_chklmem_stk_sz_' [-Wunused-variable]
      
      src/mpid/ch3/src/ch3u_rma_ops.c: In function 'MPIDI_Accumulate':
      src/mpid/ch3/src/ch3u_rma_ops.c:350:5: warning: unused variable
      'mpiu_chklmem_stk_sz_' [-Wunused-variable]
      
      See ticket #1966
      Signed-off-by: Pavan Balaji <balaji@mcs.anl.gov>
  8. 17 Dec, 2013 1 commit
  9. 26 Sep, 2013 3 commits
  10. 01 Aug, 2013 5 commits
  11. 28 Jul, 2013 4 commits
  12. 07 May, 2013 1 commit
  13. 01 Mar, 2013 1 commit
  14. 22 Feb, 2013 1 commit
    • CH3 default shared memory window implementation · 8cbf6414
      James Dinan authored
      This adds a default shared memory window implementation for CH3 (used
      by e.g. sock), which works only for MPI_COMM_SELF (this is what the
      default comm_split_type provides).  This closes ticket #1666.
      
      Reviewer: apenya
  15. 11 Jan, 2013 1 commit
    • Implemented interprocess shared memory RMA ops · 58ec39c5
      James Dinan authored
      Communication operations on shared memory windows now perform the op directly
      on the shared buffer.  This required the addition of a per-window interprocess
      mutex to ensure that atomics and accumulates are performed atomically.
      
      Reviewer: buntinas
  16. 08 Nov, 2012 3 commits
    • [svn-r10592] Updated active target to use a shared ops list · 5510107a
      James Dinan authored
      This fixes the performance regression that was introduced by concatenation of
      per-target lists.
      
      Reviewer: goodell
    • [svn-r10590] Renamed fence_cnt to fence_issued · b054ac23
      James Dinan authored
      The fence_cnt field in MPID_Win is not a counter, it's a flag that indicates if
      fence has been called.
      
      Reviewer: buntinas
    • [svn-r10587] RMA epoch tracking · b001136e
      James Dinan authored
      This patch adds code to track the RMA epoch state of the local process.
      Currently, we are tracking the synchronization states that are allowed by
      MPICH; in the future, we may want to restrict this to only states that are
      allowed by the standard.  The addition of epoch tracking has several benefits:
      
       * It allows us to detect synchronization errors (implemented in this patch).
       * It allows us to implement lock_all more efficiently (implemented in this
         patch).
       * It will allow us to distinguish between active and passive target epochs and
         avoid O(p) op list concatenation (future patch).
      
      Reviewer: balaji
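Detecting synchronization errors through epoch tracking could be sketched as below. This is a heavily simplified model that covers only a single lock epoch; the epoch names, error codes, and transition rules are assumptions for illustration and do not reproduce the set of states MPICH actually permits.

```c
#include <assert.h>

/* Hypothetical local epoch states (only the lock case is modeled). */
typedef enum { EPOCH_NONE, EPOCH_LOCK } epoch_t;

/* Hypothetical return codes standing in for MPI error classes. */
enum { RMA_OK = 0, RMA_ERR_SYNC = 1 };

typedef struct { epoch_t epoch; } win_t;

/* Opening an epoch is only legal when no epoch is currently open. */
static int win_lock(win_t *w)
{
    if (w->epoch != EPOCH_NONE)
        return RMA_ERR_SYNC;
    w->epoch = EPOCH_LOCK;
    return RMA_OK;
}

/* Closing an epoch is only legal when a matching epoch is open. */
static int win_unlock(win_t *w)
{
    if (w->epoch != EPOCH_LOCK)
        return RMA_ERR_SYNC;
    w->epoch = EPOCH_NONE;
    return RMA_OK;
}

/* Communication outside of any epoch is a synchronization error. */
static int rma_put(const win_t *w)
{
    return (w->epoch == EPOCH_NONE) ? RMA_ERR_SYNC : RMA_OK;
}
```

Because the state is kept locally, each check is a constant-time comparison and requires no communication.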
  17. 05 Nov, 2012 1 commit