1. 03 Nov, 2014 2 commits
    • Add global / local pools of RMA ops and related APIs. · fc7617f2
      Xin Zhao authored
      
      
      Instead of allocating and deallocating an RMA operation each time
      the user posts an RMA op, we allocate fixed-size operation
      pools beforehand and take an op element from those pools
      when an RMA op is posted.
      
      With only a local (per-window) op pool, the number of ops
      allocated can increase arbitrarily if many windows are created.
      Alternatively, if we only use a global op pool, other windows
      might use up all operations thus starving the window we are
      working on.
      
      In this patch we create two pools: a local (per-window) pool and a
      global pool.  Every window is guaranteed to have at least the number
      of operations in the local pool.  If we run out of these operations,
      we check in the global pool to see if we have any operations left.
      When an operation is released, it is added back to the same pool it
      was allocated from.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
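The local-then-global allocation order described in this commit can be sketched as follows. This is a minimal illustration, not the MPICH code: the type names, pool sizes, and function names (`rma_op`, `window`, `op_alloc`, `op_free`, and so on) are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes; real pool sizes would be tuned or configurable. */
#define LOCAL_POOL_SIZE  2
#define GLOBAL_POOL_SIZE 3

typedef enum { POOL_LOCAL, POOL_GLOBAL } pool_kind;

typedef struct rma_op {
    struct rma_op *next;   /* free-list link */
    pool_kind origin;      /* pool this element was carved from */
} rma_op;

typedef struct {
    rma_op slots[LOCAL_POOL_SIZE];   /* per-window (local) pool storage */
    rma_op *free_list;
} window;

static rma_op global_slots[GLOBAL_POOL_SIZE];
static rma_op *global_free;

/* Thread all slots onto a free list and tag them with their pool. */
static rma_op *build_free_list(rma_op *slots, size_t n, pool_kind kind)
{
    rma_op *head = NULL;
    for (size_t i = 0; i < n; i++) {
        slots[i].origin = kind;
        slots[i].next = head;
        head = &slots[i];
    }
    return head;
}

void global_pool_init(void)
{
    global_free = build_free_list(global_slots, GLOBAL_POOL_SIZE, POOL_GLOBAL);
}

void window_init(window *win)
{
    win->free_list = build_free_list(win->slots, LOCAL_POOL_SIZE, POOL_LOCAL);
}

/* Take from the window's local pool first, then fall back to the global
 * pool. Returns NULL when both pools are exhausted. */
rma_op *op_alloc(window *win)
{
    rma_op **head = win->free_list ? &win->free_list : &global_free;
    rma_op *op = *head;
    if (op != NULL)
        *head = op->next;
    return op;
}

/* Return the element to the same pool it was allocated from. */
void op_free(window *win, rma_op *op)
{
    assert(op != NULL);
    rma_op **head = (op->origin == POOL_LOCAL) ? &win->free_list : &global_free;
    op->next = *head;
    *head = op;
}
```

In this sketch a window that exhausts the global pool cannot starve another window, because each window still owns its `LOCAL_POOL_SIZE` local elements.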
    • Temporarily remove all RMA PVARs. · 5c513032
      Xin Zhao authored
      
      
      Because we are going to rewrite the RMA infrastructure
      and many PVARs will no longer be used, we temporarily
      remove all PVARs here and will add the needed PVARs back after
      the new implementation is done.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
  2. 01 Nov, 2014 1 commit
    • Bug-fix: always waiting for remote completion in Win_unlock. · c76aa786
      Xin Zhao authored
      
      
      The original implementation includes an optimization that
      allows Win_unlock for an exclusive lock to return without
      waiting for remote completion. This relies on the
      assumption that window memory on the target process will not
      be accessed by a third party until the target process
      finishes all RMA operations and grants the lock to other
      processes. However, this assumption does not hold if the user
      passes the assert MPI_MODE_NOCHECK. Consider the following code:
      
                P0                              P1           P2
          MPI_Win_lock(P1, NULL, exclusive);
          MPI_Put(X);
          MPI_Win_unlock(P1, exclusive);
          MPI_Send (P2);                                MPI_Recv(P0);
                                                        MPI_Win_lock(P1, MODE_NOCHECK, exclusive);
                                                        MPI_Get(X);
                                                        MPI_Win_unlock(P1, exclusive);
      
      Both P0 and P2 take an exclusive lock on P1, and P2 uses the assert
      MPI_MODE_NOCHECK because the lock should be granted to P2 once
      P2 and P0 have synchronized. However, in the original
      implementation, the GET operation on P2 might not see the updated
      value, since Win_unlock on P0 returns without waiting for remote
      completion.
      
      In this patch we remove this optimization. In Win_free, since every
      Win_unlock now guarantees remote completion, the target process no
      longer needs to do additional counting work to detect target-side
      completion; it only needs to do a global barrier.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
  3. 30 Oct, 2014 1 commit
  4. 01 Oct, 2014 1 commit
  5. 28 Sep, 2014 1 commit
  6. 23 Sep, 2014 1 commit
    • Bug-fix: waiting for ACKs for Active Target Synchronization. · 74189446
      Xin Zhao authored
      
      
      The original implementations of FENCE and PSCW do not
      guarantee remote completion of issued RMA operations
      when MPI_Win_complete and MPI_Win_fence return. They only
      guarantee local completion of issued operations and
      completion of incoming operations. This is not correct
      if we try to read updated values on the target side using
      synchronization calls with MPI_MODE_NOCHECK.
      
      We fix this by making the runtime wait for ACKs from all
      targets before returning from MPI_Win_fence and MPI_Win_complete.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
  7. 31 Jul, 2014 1 commit
    • Add MPI_Comm_revoke · 57f6ee88
      Wesley Bland authored
      
      
      MPI_Comm_revoke is a special function because it does not have a matching call
      on the "receiving side". This is because it has to act as an out-of-band,
      resilient broadcast algorithm. Because of this, in this commit, in addition to
      the usual functions to implement MPI communication calls (MPI/MPID/CH3/etc.),
      we add a new CH3 packet type that will handle revoking a communicator without
      involving a matching call from the MPI layer (similar to how RMA is currently
      implemented).
      
      The thing that must be handled most carefully when revoking a communicator is
      to ensure that a previously used context ID will eventually be returned to the
      pool of available context IDs and that after this occurs, no old messages will
      match the new usage of the context ID (for instance, if some messages are very
      slow and show up late). To accomplish this, revoke is implemented as an
      all-to-all algorithm. When one process calls revoke, it sends a message to
      all other processes in the communicator; each recipient in turn sends
      a message to all other processes, and so on. Once a process has already
      revoked its communicator locally, it won't send out another wave of messages.
      As each process receives the revoke messages from the other processes, it will
      track how many messages have been received. Once it has either received a
      revoke message or a message about a process failure for each other process, it
      will release its refcount on the communicator object. After the application
      has freed all of its references to the communicator (and all requests, files,
      etc. associated with it), the context ID will be returned to the available
      pool.
      Signed-off-by: Junchao Zhang <jczhang@mcs.anl.gov>
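The all-to-all revoke wave and the per-process message counting can be illustrated with a small single-process simulation. This is only a sketch under simplifying assumptions: the names (`proc_state`, `revoke_local`, `handle_revoke`, `simulate_revoke`) are hypothetical, messages are delivered through an in-memory queue instead of CH3 packets, and process-failure notifications are left out.

```c
#include <assert.h>
#include <stdbool.h>

#define NPROCS 4   /* hypothetical communicator size */

typedef struct {
    bool revoked;        /* communicator already revoked locally? */
    int  waiting_for;    /* peers not yet heard from */
    bool ref_released;   /* internal reference on the communicator dropped? */
} proc_state;

typedef struct { int src, dst; } revoke_msg;

static proc_state procs[NPROCS];
static revoke_msg queue[NPROCS * NPROCS];   /* room for every possible message */
static int q_head, q_tail;

static void send_to_all_others(int rank)
{
    for (int peer = 0; peer < NPROCS; peer++)
        if (peer != rank)
            queue[q_tail++] = (revoke_msg){ rank, peer };
}

/* Revoke locally at most once, so no second wave of messages is sent. */
static void revoke_local(int rank)
{
    if (procs[rank].revoked)
        return;
    procs[rank].revoked = true;
    procs[rank].waiting_for = NPROCS - 1;
    send_to_all_others(rank);
}

/* Receiving a revoke message triggers a local revoke (if not done yet)
 * and counts the sender as heard from. */
static void handle_revoke(int rank)
{
    revoke_local(rank);
    assert(procs[rank].waiting_for > 0);
    if (--procs[rank].waiting_for == 0)
        procs[rank].ref_released = true;
}

/* Runs the wave started by `initiator`; returns the number of messages delivered. */
int simulate_revoke(int initiator)
{
    for (int i = 0; i < NPROCS; i++)
        procs[i] = (proc_state){ false, 0, false };
    q_head = q_tail = 0;

    revoke_local(initiator);
    int delivered = 0;
    while (q_head < q_tail) {
        revoke_msg m = queue[q_head++];
        handle_revoke(m.dst);
        delivered++;
    }
    return delivered;
}

bool all_released(void)
{
    for (int i = 0; i < NPROCS; i++)
        if (!procs[i].ref_released)
            return false;
    return true;
}
```

In this model every process sends exactly one wave of `NPROCS - 1` messages, and each process releases its reference only after hearing from every other process.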
  8. 11 Apr, 2014 1 commit
  9. 13 Mar, 2014 1 commit
    • Fixes inconsistent definition of parameters · 33337436
      Huiwei Lu authored
      
      
      In MPID_Win_allocate and MPID_Win_allocate_shared, baseptr is declared
      as void * and void **, respectively, while in MPIDI_Win_fns both
      MPID_Win_allocate and MPID_Win_allocate_shared are registered as
      MPIDI_CH3U_Win_allocate, where baseptr is declared as void *.
      
      Fixes #1995
      Signed-off-by: Junchao Zhang <jczhang@mcs.anl.gov>
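Why a single `void *` declaration is enough for both entry points can be shown with a small sketch. It follows the convention of the MPI `MPI_Win_allocate` binding, where `baseptr` is declared `void *` but actually carries the address of the caller's pointer. The function table and all names below (`win_alloc_fn`, `win_fns`, `sketch_win_allocate`) are hypothetical stand-ins, and `malloc` stands in for the real window allocation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical function-pointer type: one signature shared by both entries. */
typedef int (*win_alloc_fn)(size_t size, void *baseptr);

/* baseptr is declared void *, but holds the address of the caller's
 * pointer, so the allocated base address is written through it. */
int sketch_win_allocate(size_t size, void *baseptr)
{
    assert(baseptr != NULL);
    void *mem = malloc(size);
    if (mem == NULL && size > 0)
        return 1;                 /* allocation failure */
    *(void **) baseptr = mem;
    return 0;
}

/* Hypothetical function table: both slots share one implementation, which
 * only type-checks if both declarations agree on the type of baseptr. */
typedef struct {
    win_alloc_fn allocate;
    win_alloc_fn allocate_shared;
} win_fns;

const win_fns sketch_fns = { sketch_win_allocate, sketch_win_allocate };
```

A caller passes the address of its pointer, for example `void *base; sketch_fns.allocate(64, &base);`.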
  10. 17 Dec, 2013 1 commit
  11. 26 Sep, 2013 1 commit
  12. 01 Aug, 2013 3 commits
  13. 28 Jul, 2013 1 commit
    • Add "alloc_shm" info to MPI_Win_allocate. · 384d96b7
      Xin Zhao authored
      
      
      Add "alloc_shm" to window's info arguments and initialize it to FALSE.
      In MPID_Win_allocate, if "alloc_shm" is set to true, call ALLOCATE_SHARED,
      otherwise call ALLOCATE.
      
      Free window memory only when no SHM region is allocated; otherwise it is
      already freed in MPIDI_CH3I_SHM_Win_free.
      Signed-off-by: Pavan Balaji <balaji@mcs.anl.gov>
  14. 08 Jul, 2013 1 commit
  15. 07 May, 2013 1 commit
  16. 08 Nov, 2012 3 commits
    • [svn-r10592] Updated active target to use a shared ops list · 5510107a
      James Dinan authored
      This fixes the performance regression that was introduced by concatenation of
      per-target lists.
      
      Reviewer: goodell
    • [svn-r10590] Renamed fence_cnt to fence_issued · b054ac23
      James Dinan authored
      The fence_cnt field in MPID_Win is not a counter; it is a flag that indicates
      whether fence has been called.
      
      Reviewer: buntinas
    • [svn-r10587] RMA epoch tracking · b001136e
      James Dinan authored
      This patch adds code to track the RMA epoch state of the local process.
      Currently, we are tracking the synchronization states that are allowed by
      MPICH; in the future, we may want to restrict this to only states that are
      allowed by the standard.  The addition of epoch tracking has several benefits:
      
       * It allows us to detect synchronization errors (implemented in this patch).
       * It allows us to implement lock_all more efficiently (implemented in this
         patch).
       * It will allow us to distinguish between active and passive target epochs and
         avoid O(p) op list concatenation (future patch).
      
      Reviewer: balaji
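How tracking the local epoch state makes synchronization errors detectable can be sketched with a tiny state machine. This is a deliberately reduced illustration: the state names and functions are hypothetical, only passive-target lock and lock_all epochs are modeled, and the set of states MPICH actually tracks is larger.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced set of epoch states for one window. */
typedef enum {
    EPOCH_NONE,       /* no access epoch open */
    EPOCH_LOCK,       /* passive target epoch opened by lock */
    EPOCH_LOCK_ALL    /* passive target epoch opened by lock_all */
} epoch_state;

typedef struct { epoch_state epoch; } win_state;

/* Open an epoch; fails (synchronization error) if one is already open. */
static bool enter(win_state *w, epoch_state next)
{
    assert(w != NULL);
    if (w->epoch != EPOCH_NONE)
        return false;
    w->epoch = next;
    return true;
}

/* Close an epoch; fails if the matching epoch is not the one open. */
static bool leave(win_state *w, epoch_state expected)
{
    assert(w != NULL);
    if (w->epoch != expected)
        return false;
    w->epoch = EPOCH_NONE;
    return true;
}

bool win_lock(win_state *w)       { return enter(w, EPOCH_LOCK); }
bool win_unlock(win_state *w)     { return leave(w, EPOCH_LOCK); }
bool win_lock_all(win_state *w)   { return enter(w, EPOCH_LOCK_ALL); }
bool win_unlock_all(win_state *w) { return leave(w, EPOCH_LOCK_ALL); }

/* An RMA operation is only legal inside some epoch. */
bool rma_op_allowed(const win_state *w)
{
    return w->epoch != EPOCH_NONE;
}
```

With the state recorded per window, a mismatched call such as unlock without a preceding lock is rejected locally instead of corrupting the protocol.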
  17. 05 Nov, 2012 5 commits
    • [svn-r10531] Refactored struct and enum naming to MPICH style · 7e179a85
      James Dinan authored
      Updated RMA code to remove trailing "_e" and "_s" on enum and struct type
      names to match the MPICH style.
      
      Reviewer: goodell
    • [svn-r10515] Implementation of passive multi-target synch · 656b26f5
      James Dinan authored
      Updated RMA implementation to track the passive target status individually, for
      each target.  Includes new implementation for lock/unlock_all.  Lock_all is
      currently unoptimized, see #1734 for future plans.
      
      Reviewer: buntinas
    • [svn-r10513] Support for one RMA op list per target · ab97edb7
      James Dinan authored
      The use of a dense array is a temporary measure to support the reference
      implementation.  This will be much improved by ticket #1735.
      
      Reviewer: goodell
    • [svn-r10511] Removed old synch. error checking in RMA · 4bff013d
      James Dinan authored
      The old "lockRank" error checking is no longer sufficient in MPI 3.0 and must
      be removed to add support for locking multiple targets.
      
      Reviewer: balaji
    • [svn-r10508] Refactoring RMA Ops list to DL · cdb1b3e4
      James Dinan authored
      In this patch, I have refactored the RMA ops list again to use the MPL UTList
      doubly-linked list and to treat the list as a proper object.  This should set
      us up to work with multiple lists, as we will soon have one list per target.
      Doubly-linking the list is a big help in terms of maintainability (no more
      prevNext pointers) and flexibility (better implementation of request-based
      ops and other optimizations).
      
      Reviewer: goodell
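The maintainability argument for double linking (no "prevNext" pointer has to be carried around to unlink an element) can be seen in a minimal list sketch. This hand-written list is an assumption-laden stand-in for illustration only; it does not use the MPL UTList macros, and the names `rma_op`, `ops_list`, and the `ops_list_*` functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct rma_op {
    struct rma_op *prev, *next;
    int target_rank;           /* illustrative payload */
} rma_op;

/* The list is treated as an object with its own head and tail. */
typedef struct {
    rma_op *head, *tail;
} ops_list;

void ops_list_init(ops_list *l)
{
    l->head = l->tail = NULL;
}

/* O(1) append at the tail. */
void ops_list_append(ops_list *l, rma_op *op)
{
    op->next = NULL;
    op->prev = l->tail;
    if (l->tail != NULL)
        l->tail->next = op;
    else
        l->head = op;
    l->tail = op;
}

/* O(1) unlink of any element; only the element itself is needed. */
void ops_list_remove(ops_list *l, rma_op *op)
{
    assert(l->head != NULL);
    if (op->prev != NULL)
        op->prev->next = op->next;
    else
        l->head = op->next;
    if (op->next != NULL)
        op->next->prev = op->prev;
    else
        l->tail = op->prev;
    op->prev = op->next = NULL;
}

size_t ops_list_length(const ops_list *l)
{
    size_t n = 0;
    for (const rma_op *op = l->head; op != NULL; op = op->next)
        n++;
    return n;
}
```

Because each list is a separate `ops_list` object, keeping one list per target amounts to keeping one such object per target.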
  18. 25 Oct, 2012 2 commits
  19. 22 Oct, 2012 1 commit
  20. 20 Oct, 2012 1 commit
    • [svn-r10423] Added passive target immediate locking · 5109ab1b
      James Dinan authored
      When enabled, this mode of operation immediately requests the lock when
      MPI_Win_lock is called.  Currently, this is enabled by setting the
      MPICH_RMA_LOCK_IMMED environment variable.  In the future, we can also make
      this mode of operation available through an info/assert.  This capability is
      needed to implement MPI-3's flush operations.
      
      Reviewer: buntinas
  21. 19 Oct, 2012 1 commit
  22. 10 Oct, 2012 1 commit
  23. 23 Aug, 2012 3 commits
    • [svn-r10143] Implementation of dynamic windows. · aa8a7afb
      James Dinan authored
      This commit adds an implementation of MPI-3 dynamic windows.  This
      implementation exposes all of memory in the window, rendering attach and detach
      as no-ops.  Currently, no error checking is done to determine if RMA ops target
      valid/exposed locations at the target.  This would be a nice addition (and can
      be done at the target in the two-sided ch3 implementation), but it would incur
      an O(log(attached_segments)) performance cost.
      
      Reviewer: buntinas
    • [svn-r10142] Shared mem window: added disp_unit, fixed size=0. · 3530af43
      James Dinan authored
      Added the missing disp_unit argument (was added in a later revision of the MPI
      3.0 spec) and fixed a bug in base pointer calculations when processes pass a
      size of 0.  Added a test case to test MPI-2 ops on shared memory windows.
      
      Reviewer: buntinas
    • [svn-r10140] Moved MPID RMA constants to RMA header file. · b79630d2
      James Dinan authored
      Moved RMA implementation constants from mpidimpl.h to the RMA implementation
      header.  Also updated constants to use enumeration types and removed an old
      fixme note, which indicated that this should be done.
      
      Reviewer: buntinas
  24. 08 Aug, 2012 2 commits
    • [svn-r10115] New CH3 window functions interface. · 55589398
      James Dinan authored
      This adds the win_fns table to ch3, which allows the channel to override the
      default implementation of window creation routines provided by ch3.  This also
      pushes the implementation of shared memory windows down into Nemesis, includes
      window functions for sock, and contains multiple improvements to the window
      creation functions code.
      
      Reviewer: buntinas
    • [svn-r10114] Removed old/unused RMA vtable in CH3. · 52d980d7
      James Dinan authored
      Removed the old RMA virtual function infrastructure from CH3 -- this code was
      all already dead.  Function overrides are already provided per-window in the
      MPID_Win structure.  Overrides for non-window-specific (window creation)
      operations will be added shortly.
      
      Reviewer: buntinas
  25. 01 Aug, 2012 1 commit
  26. 31 Jul, 2012 1 commit
  27. 29 Jul, 2012 1 commit