1. 16 Dec, 2014 28 commits
    • Xin Zhao's avatar
      Add new pkt flags for different LOCK ACKs. · faae55ad
      Xin Zhao authored
      Add new flags for four different kinds of LOCK ACKs:
      
      (1) LOCK_GRANTED: lock is granted on target.
      (2) LOCK_QUEUED_DATA_QUEUED: lock is not granted on target,
          but it is safely queued on target. If this lock request
          is sent with an RMA operation, the operation data is also
          safely queued on target.
      (3) LOCK_QUEUED_DATA_DISCARDED: lock is not granted on target,
          but it is safely queued on target. If this lock request
          is sent with an RMA operation, the operation data is discarded
          on target because the target is out of resources.
      (4) LOCK_DISCARDED: lock is not granted on target, and it is
          not queued on target because the target is out of resources.
          If this lock request is sent with an RMA operation, the
          operation data is also discarded on target.
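
The four cases above can be read as a small decision on the target side. The following is a minimal sketch, not the MPICH implementation: the enum names, the function `choose_lock_ack`, and its three boolean inputs are hypothetical, and it assumes the lock request arrives with a piggybacked RMA operation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names mirroring the four LOCK ACK kinds listed above;
 * the real packet-flag identifiers in MPICH may differ. */
typedef enum {
    LOCK_ACK_GRANTED,
    LOCK_ACK_QUEUED_DATA_QUEUED,
    LOCK_ACK_QUEUED_DATA_DISCARDED,
    LOCK_ACK_DISCARDED
} lock_ack_kind_t;

/* Pick the ACK the target would send back (assumed inputs):
 *   lock_free      - the lock can be granted right away
 *   can_queue_lock - the target has room to queue the lock request
 *   can_queue_data - the target has room to buffer the piggybacked data */
lock_ack_kind_t choose_lock_ack(bool lock_free, bool can_queue_lock,
                                bool can_queue_data)
{
    if (lock_free)
        return LOCK_ACK_GRANTED;
    if (!can_queue_lock)
        return LOCK_ACK_DISCARDED;  /* piggybacked data is dropped as well */
    return can_queue_data ? LOCK_ACK_QUEUED_DATA_QUEUED
                          : LOCK_ACK_QUEUED_DATA_DISCARDED;
}
```

In this reading, the origin would presumably have to resend the operation data after LOCK_QUEUED_DATA_DISCARDED, and both the lock request and the data after LOCK_DISCARDED; the commit message does not spell out the origin-side handling.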
      
      No reviewer.
      faae55ad
    • Xin Zhao's avatar
      Change routine/pkt name from LOCK_GRANTED to LOCK_ACK · e36203c3
      Xin Zhao authored
      We will now send different kinds of LOCK ACKs (not just
      LOCK_GRANTED but also, for example, LOCK_DISCARDED), so
      naming the related packets and functions "LOCK_GRANTED"
      is no longer appropriate. Here we rename them to "LOCK_ACK".
      
      No reviewer.
      e36203c3
    • Xin Zhao's avatar
      Bug-fix: add pkt type LOCK in GET_TARGET_WIN_HANDLE macro · 385f0aae
      Xin Zhao authored
      No reviewer.
      385f0aae
    • Xin Zhao's avatar
      Code-refactor: Move send_flush_msg function to header file. · 2b53ff69
      Xin Zhao authored
      No reviewer.
      2b53ff69
    • Xin Zhao's avatar
      Re-organize progress engine functions. · 1962d3b1
      Xin Zhao authored
      Rewrite progress engine functions as follows:
      
      Basic functions:
      
      (1) check_target_state: check to see if we can switch target state,
          issue synchronization messages if needed.
      (2) issue_ops_target: issue all pending operations to this target.
      (3) check_window_state: check to see if we can switch window state.
      (4) issue_ops_win: issue all pending operations on this window.
          Currently it internally calls check_target_state and
          issue_ops_target; this should be optimized in the future.
      
      Progress making functions:
      
      (1) Make_progress_target: make progress on one target; it
          internally calls check_target_state and issue_ops_target.
      (2) Make_progress_win: make progress on all targets on one window;
          it internally calls check_window_state and issue_ops_win.
      (3) Make_progress_global: make progress on all windows; it
          internally calls Make_progress_win.
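
The call hierarchy described above can be sketched as follows. This is an illustration under assumptions, not the actual MPICH code: the struct types, their fields, and the trivial function bodies are hypothetical stand-ins; only the function names and who calls whom come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; type and field names are hypothetical. */
typedef struct {
    int sync_needed;   /* a synchronization message still has to be issued */
    int pending_ops;   /* operations queued for this target */
} target_t;

typedef struct {
    target_t *targets;
    size_t num_targets;
    int state_checks;  /* how many times the window state was examined */
} win_t;

/* Basic functions (bodies are placeholders for the real work). */
void check_target_state(target_t *t) { t->sync_needed = 0; }
void issue_ops_target(target_t *t)   { t->pending_ops = 0; }
void check_window_state(win_t *w)    { w->state_checks++; }

void issue_ops_win(win_t *w)
{
    /* As the message notes, this currently goes through the per-target
     * functions for every target on the window. */
    for (size_t i = 0; i < w->num_targets; i++) {
        check_target_state(&w->targets[i]);
        issue_ops_target(&w->targets[i]);
    }
}

/* Progress-making functions. */
void make_progress_target(target_t *t)
{
    check_target_state(t);
    issue_ops_target(t);
}

void make_progress_win(win_t *w)
{
    check_window_state(w);
    issue_ops_win(w);
}

void make_progress_global(win_t *wins, size_t num_wins)
{
    for (size_t i = 0; i < num_wins; i++)
        make_progress_win(&wins[i]);
}
```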
      
      No reviewer.
      1962d3b1
    • Xin Zhao's avatar
      Modify struct name: replace "struct XXX" with "XXX_t" · 7c533ef3
      Xin Zhao authored
      No reviewer.
      7c533ef3
    • Xin Zhao's avatar
      54af207c
    • Xin Zhao's avatar
      Use window pool to manage lock requests · 886b1d8d
      Xin Zhao authored
      No reviewer.
      886b1d8d
    • Xin Zhao's avatar
      Set window epoch state at proper places in RMA calls. · 89d8f6c1
      Xin Zhao authored
      (1) Win_fence/Win_start: set access state right after we
          issue synchronization calls.
      (2) Win_post: set exposure state at beginning.
      (3) Win_wait/Win_test: set exposure state at end.
      (4) Win_lock/Win_lock_all: set access state at beginning.
      (5) Win_unlock/Win_unlock_all: set access state at end.
      
      No reviewer.
      89d8f6c1
    • Xin Zhao's avatar
      Reset/release window attributes in RMA sync calls. · ff6e5f9b
      Xin Zhao authored
      In Win_complete, release all requests on window; in
      Win_unlock_all, reset lock_assert on window.
      
      No reviewer.
      ff6e5f9b
    • Xin Zhao's avatar
      Bug-fix: always allocate ranks array in Win_start. · 264be641
      Xin Zhao authored
      We always need to allocate an array to store group ranks,
      even in the MPI_MODE_NOCHECK case, because we always
      need it in Win_complete.
      
      No reviewer.
      264be641
    • Xin Zhao's avatar
      Check if all targets are freed at end of RMA sync calls. · 6b56d44a
      Xin Zhao authored
      For Win_fence, Win_complete and Win_unlock_all, check if
      all targets are freed at the end of function calls.
      
      No reviewer.
      6b56d44a
    • Xin Zhao's avatar
      Do memory barriers at proper places in RMA sync calls. · 6f8c3e59
      Xin Zhao authored
      We call memory barriers at the proper places in RMA sync calls
      as follows, and remove unnecessary memory barriers:
      
      (1) Win_fence: very beginning and very end.
      (2) Win_post/Win_complete: very beginning.
      (3) Win_start/Win_wait/Win_test: very end.
      (4) Win_lock/Win_lock_all: very end.
      (5) Win_unlock/Win_unlock_all: very beginning.
      (6) Win_flush/Win_flush_local/Win_flush_all/Win_flush_local_all: very beginning.
      
      For the reasoning behind this, please refer to the comments
      at the beginning of src/mpid/ch3/src/ch3u_rma_sync.c.
      
      No reviewer.
      6f8c3e59
    • Xin Zhao's avatar
      Poke progress engine in RMA sync call when needed · fb6a441b
      Xin Zhao authored
      In ending RMA synchronization calls, we poke the
      progress engine at the end if we have not poked it before.
      Some program execution depends on the incoming events
      in the progress engine, so if we never process them
      we may cause a deadlock in the program.
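
A minimal sketch of that rule, assuming a hypothetical ending sync call that tracks whether it has poked the progress engine; `poke_progress_engine`, `end_sync_call`, and the `pokes` counter are illustrative names, not MPICH functions.

```c
#include <assert.h>
#include <stdbool.h>

int pokes = 0;  /* counts calls to the stand-in progress engine */

/* Stand-in for one pass of the progress engine over incoming events. */
void poke_progress_engine(void) { pokes++; }

/* Hypothetical ending synchronization call: it pokes the engine while
 * waiting for its own operations, and pokes once at the end if it never
 * had a reason to do so earlier. */
void end_sync_call(int pending_ops)
{
    bool progress_made = false;

    while (pending_ops > 0) {
        poke_progress_engine();
        progress_made = true;
        pending_ops--;          /* pretend one operation completed */
    }

    if (!progress_made)
        poke_progress_engine(); /* still process incoming events once */
}
```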
      
      No reviewer.
      fb6a441b
    • Xin Zhao's avatar
      Bug-fix: modify free_ops_before_completion function · 04d15190
      Xin Zhao authored
      Originally the free_ops_before_completion function only
      worked with active target. Here we modify it to accommodate
      passive target as well.
      
      Also, every time we trigger free_ops_before_completion,
      we lose the chance to do a real Win_flush_local operation
      and must do a Win_flush instead. Here we convert
      Win_flush_local to Win_flush if the disable_flush_local flag
      is set, and unset that flag after the current flush
      is done.
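
The flag handling described above might look roughly like this. It is a sketch under assumptions: only the `disable_flush_local` flag and the `free_ops_before_completion` name come from the commit message; the `rma_state_t` struct, the enum, and the other helper functions are hypothetical, and where the flag actually lives (window or target) is not stated here.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { FLUSH_LOCAL, FLUSH_REMOTE } flush_kind_t;

/* Hypothetical holder for the flag named in the commit message. */
typedef struct {
    bool disable_flush_local;
} rma_state_t;

/* Freeing operations before they complete locally means a later
 * flush_local can no longer be answered from local state. */
void free_ops_before_completion(rma_state_t *s)
{
    s->disable_flush_local = true;
}

/* Decide which flush to perform: a requested flush_local is converted
 * to a full flush while the flag is set. */
flush_kind_t effective_flush(const rma_state_t *s, flush_kind_t requested)
{
    if (requested == FLUSH_LOCAL && s->disable_flush_local)
        return FLUSH_REMOTE;
    return requested;
}

/* After the current flush is done, the flag is cleared again. */
void finish_flush(rma_state_t *s)
{
    s->disable_flush_local = false;
}
```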
      
      No reviewer.
      04d15190
    • Xin Zhao's avatar
      097c9628
    • Xin Zhao's avatar
      Use int instead of size_t in RMA pkt header. · 3a05784f
      Xin Zhao authored
      Use int instead of size_t in RMA pkt header to reduce
      packet size.
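
To see why this shrinks the header, compare two toy layouts. The struct and field names below are hypothetical and do not reproduce the real MPICH packet definitions; the exact sizes depend on the platform ABI (on typical 64-bit systems size_t is 8 bytes and int is 4).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical header using size_t for its size-like fields. */
typedef struct {
    int type;
    size_t count;
    size_t data_sz;
} hdr_with_size_t;

/* The same fields narrowed to int. */
typedef struct {
    int type;
    int count;
    int data_sz;
} hdr_with_int;

/* Bytes saved per packet header on the current platform. */
size_t header_bytes_saved(void)
{
    return sizeof(hdr_with_size_t) - sizeof(hdr_with_int);
}
```

The trade-off is that an int field cannot represent values above INT_MAX, so this only suits fields whose values are known to stay in that range.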
      
      No reviewer.
      3a05784f
    • Xin Zhao's avatar
      Bug-fix: set put_acc_issued flag correctly · cc158ff2
      Xin Zhao authored
      No reviewer.
      cc158ff2
    • Xin Zhao's avatar
      Perf-optimize: avoid FLUSH/FLUSH_ACK messages if no PUT/ACC. · 2493e98b
      Xin Zhao authored
      No reviewer.
      2493e98b
    • Xin Zhao's avatar
      Bug-fix: add IMMED area in GET/GACC response packets · 87acbbbe
      Xin Zhao authored
      In this patch we allow GET/GACC response packets to
      piggyback some IMMED data, as we already do
      for PUT/GACC/FOP/CAS packets.
      
      No reviewer.
      87acbbbe
    • Xin Zhao's avatar
      Perf-optimize: support piggybacking LOCK on large RMA operations. · 4739df59
      Xin Zhao authored
      Originally we only allowed a LOCK request to be piggybacked
      on small RMA operations (where all data fits in the packet
      header). This adds communication overhead for larger
      operations, since the origin side needs to wait for the LOCK
      ACK before it can transmit data to the target.
      
      In this patch we add support for piggybacking LOCK on
      RMA operations of arbitrary size. Note that (1) this
      only works with basic datatypes; (2) if the LOCK cannot
      be satisfied, we temporarily buffer the operation on
      the target side.
      
      No reviewer.
      4739df59
    • Xin Zhao's avatar
      c73451c0
    • Xin Zhao's avatar
      Bug-fix: handle dest==MPI_PROC_NULL in Win_flush/flush_local · e12376fd
      Xin Zhao authored
      No reviewer.
      e12376fd
    • Xin Zhao's avatar
      Bug-fix: check win_ptr->active_req_cnt in RMA sync calls · e92b7746
      Xin Zhao authored
      No reviewer.
      e92b7746
    • Xin Zhao's avatar
      Bug-fix: correctly modify win_ptr->accumulated_ops_cnt · 7b1a5e2d
      Xin Zhao authored
      accumulated_ops_cnt tracks the number of RMA operations
      posted between two synchronization calls, so that we can
      decide when to poke the progress engine based on the
      current value of this counter.
      
      Here we initialize it to zero in the BEGINNING synchronization
      calls (Win_fence, Win_start, first Win_lock, Win_lock_all),
      and correctly decrement it in the ENDING synchronization calls
      (Win_fence, Win_complete, Win_unlock, Win_unlock_all,
      Win_flush, Win_flush_local, Win_flush_all, Win_flush_local_all).
      We also use a per-target counter to track the single-target case.
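
A rough sketch of how such a counter could drive the decision to poke the progress engine. Only the `accumulated_ops_cnt` name comes from the commit message; the `win_t` struct, the `POKE_THRESHOLD` value, the function names, and the choice to subtract the number of completed operations in the ending call are all assumptions for illustration.

```c
#include <assert.h>

#define POKE_THRESHOLD 4  /* hypothetical threshold, not an MPICH constant */

typedef struct {
    int accumulated_ops_cnt;  /* ops posted since the last sync call */
    int pokes;                /* stand-in: times the engine was poked */
} win_t;

/* BEGINNING sync call (e.g. Win_fence, Win_start, first Win_lock). */
void begin_sync(win_t *w)
{
    w->accumulated_ops_cnt = 0;
}

/* Posting an RMA operation: count it and poke once enough accumulate. */
void post_op(win_t *w)
{
    w->accumulated_ops_cnt++;
    if (w->accumulated_ops_cnt >= POKE_THRESHOLD)
        w->pokes++;  /* stand-in for poking the progress engine */
}

/* ENDING sync call (e.g. Win_complete, Win_unlock, Win_flush):
 * assumed here to subtract the operations it has completed. */
void end_sync(win_t *w, int completed_ops)
{
    w->accumulated_ops_cnt -= completed_ops;
}
```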
      
      No reviewer.
      7b1a5e2d
    • Xin Zhao's avatar
      Clean up unused attributes in RMA packet structs. · b155e7e0
      Xin Zhao authored
      No reviewer.
      b155e7e0
    • Xin Zhao's avatar
      Code-refactor: arrange RMA pkt structure. · 389aab16
      Xin Zhao authored
      Arrange RMA packet definition and structures in
      src/mpid/ch3/include/mpidpkt.h in the following
      order:
      
      1. RMA operation packets: PUT, GET, ACC, GACC, CAS, FOP
      2. RMA operation response packets: GET_RESP, GACC_RESP, CAS_RESP, FOP_RESP
      3. RMA control packets: LOCK, UNLOCK, FLUSH, DECR_AT_COUNTER
      4. RMA control response packets: LOCK_ACK, FLUSH_ACK
      
      No reviewer.
      389aab16
    • Xin Zhao's avatar
      Code-refactor: arrange RMA sync functions. · a544067b
      Xin Zhao authored
      Arrange RMA sync functions in src/mpid/ch3/src/ch3u_rma_sync.c
      in the following order:
      
      Win_fence
      Win_post
      Win_start
      Win_complete
      Win_wait
      Win_test
      Win_lock
      Win_unlock
      Win_flush
      Win_flush_local
      Win_lock_all
      Win_unlock_all
      Win_flush_all
      Win_flush_local_all
      Win_sync
      
      No reviewer.
      a544067b
  2. 11 Dec, 2014 1 commit
  3. 09 Dec, 2014 2 commits
  4. 08 Dec, 2014 1 commit
  5. 05 Dec, 2014 3 commits
  6. 04 Dec, 2014 1 commit
    • Min Si's avatar
      Fix win size translation in attrlangf90 test. · e7e36fc7
      Min Si authored
      
      
      This test passed a constant 0 as the size to win_create. The Fortran
      compiler treats it as a 32-bit integer, and because the Fortran
      binding does no prototype checking, the C mpi_win_create receives it
      as an invalid 64-bit MPI_Aint. The test can fail if mpi_win_create
      internally initializes resources based on the value of size (e.g., mxm
      maps the win buffer in win_init).
      
      This patch fixes the issue by passing a 64-bit local variable as the
      size parameter instead of the constant 0 in this f90 test.
      Signed-off-by: Junchao Zhang <jczhang@mcs.anl.gov>
      e7e36fc7
  7. 03 Dec, 2014 4 commits