1. 16 Dec, 2014 16 commits
    • Do memory barriers at proper places in RMA sync calls. · 6f8c3e59
      Xin Zhao authored
      We issue memory barriers at the following places in RMA sync
      calls, and remove the unnecessary ones:
      
      (1) Win_fence: very beginning and very end.
      (2) Win_post/Win_complete: very beginning.
      (3) Win_start/Win_wait/Win_test: very end.
      (4) Win_lock/Win_lock_all: very end.
      (5) Win_unlock/Win_unlock_all: very beginning.
      (6) Win_flush/Win_flush_local/Win_flush_all/Win_flush_local_all: very beginning.
      
      For the reasoning behind this placement, see the comments
      at the beginning of src/mpid/ch3/src/ch3u_rma_sync.c.
      
      No reviewer.
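      As a rough illustration of the placement listed above, the
      following C11 sketch puts a full fence at the start and end of a
      fence-style call, at the end of a lock-style call, and at the
      start of an unlock-style call. All sketch_* names and the counter
      are hypothetical and exist only so the placement can be checked;
      this is not the code in ch3u_rma_sync.c.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical counter so the sketch can report how many barriers ran. */
static int sketch_barrier_count = 0;

/* Stand-in for a full memory barrier (C11 sequentially consistent fence). */
static void sketch_mem_barrier(void)
{
    atomic_thread_fence(memory_order_seq_cst);
    sketch_barrier_count++;
}

/* Win_fence style: barrier at the very beginning and the very end. */
static void sketch_win_fence(void)
{
    sketch_mem_barrier();
    /* ... synchronization work would go here ... */
    sketch_mem_barrier();
}

/* Win_lock style: barrier at the very end. */
static void sketch_win_lock(void)
{
    /* ... lock acquisition would go here ... */
    sketch_mem_barrier();
}

/* Win_unlock style: barrier at the very beginning. */
static void sketch_win_unlock(void)
{
    sketch_mem_barrier();
    /* ... lock release would go here ... */
}
```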
    • Poke progress engine in RMA sync call when needed · fb6a441b
      Xin Zhao authored
      In ending RMA synchronization calls, we poke the progress
      engine at the end if it has not been poked earlier in the
      call. Some programs depend on incoming events being
      processed by the progress engine; if we never process
      them, the program may deadlock.
      
      No reviewer.
    • Bug-fix: modify free_ops_before_completion function · 04d15190
      Xin Zhao authored
      Originally the free_ops_before_completion function only
      worked with active target. Here we modify it to accommodate
      passive target as well.
      
      Also, every time we trigger free_ops_before_completion,
      we lose the chance to do a real Win_flush_local operation
      and must do a Win_flush instead. Here we convert
      Win_flush_local to Win_flush if the disable_flush_local flag
      is set, and unset that flag after the current flush
      is done.
      
      No reviewer.
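      The flag handling described above can be sketched as follows.
      This is a minimal illustration, assuming a window structure that
      holds only the disable_flush_local flag; the sketch_* types and
      function are hypothetical, and the flush itself is left out.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { SKETCH_FLUSH_LOCAL, SKETCH_FLUSH } sketch_flush_kind;

/* Hypothetical window state; only the flag named in the commit is modeled. */
typedef struct {
    bool disable_flush_local;
} sketch_win;

/* Returns the kind of flush that is actually performed, and clears the
 * flag once that flush is done. */
static sketch_flush_kind sketch_do_flush(sketch_win *win,
                                         sketch_flush_kind requested)
{
    sketch_flush_kind actual = requested;

    /* Operations were freed early, so fall back to a full flush. */
    if (requested == SKETCH_FLUSH_LOCAL && win->disable_flush_local)
        actual = SKETCH_FLUSH;

    /* ... the flush itself would be performed here ... */

    win->disable_flush_local = false;
    return actual;
}
```

      Once the flag has been cleared, a later Win_flush_local request
      is served as a local flush again.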
    • 097c9628
      Xin Zhao authored
    • Use int instead of size_t in RMA pkt header. · 3a05784f
      Xin Zhao authored
      Use int instead of size_t in RMA pkt header to reduce
      packet size.
      
      No reviewer.
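      To illustrate the size argument, here is a small sketch with two
      hypothetical header layouts (the field names are assumptions, not
      the real RMA packet definition). On typical 64-bit platforms
      size_t is 8 bytes and int is 4, so the narrow layout is smaller;
      the price is that each value must fit in an int.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical header fields; the real packet layout is not shown here. */
typedef struct { int type; size_t count; size_t data_sz; } sketch_pkt_wide;
typedef struct { int type; int    count; int    data_sz; } sketch_pkt_narrow;

/* A size_t value can be stored in an int field only if it fits. */
static int sketch_fits_int(size_t n)
{
    return n <= (size_t) INT_MAX;
}
```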
    • Bug-fix: set put_acc_issued flag correctly · cc158ff2
      Xin Zhao authored
      No reviewer.
    • Perf-optimize: avoid FLUSH/FLUSH_ACK messages if no PUT/ACC. · 2493e98b
      Xin Zhao authored
      No reviewer.
    • Bug-fix: add IMMED area in GET/GACC response packets · 87acbbbe
      Xin Zhao authored
      In this patch we allow GET/GACC response packets to
      piggyback some IMMED data, just as we did
      for PUT/GACC/FOP/CAS packets.
      
      No reviewer.
    • Perf-optimize: support piggybacking LOCK on large RMA operations. · 4739df59
      Xin Zhao authored
      Originally we only allowed a LOCK request to be piggybacked
      on small RMA operations (those whose data fits entirely in
      the packet header). This added communication overhead for
      larger operations, since the origin side had to wait for the
      LOCK ACK before it could transmit data to the target.
      
      In this patch we add support for piggybacking LOCK on
      RMA operations of arbitrary size. Note that (1) this
      only works with basic datatypes; (2) if the LOCK cannot
      be satisfied, we temporarily buffer the operation on
      the target side.
      
      No reviewer.
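      The target-side behavior in point (2) can be sketched as below.
      This is a simplified model under stated assumptions: the lock
      rules are plain shared/exclusive, the pending buffer has a fixed
      hypothetical capacity, and all sketch_* names are invented for
      illustration. It does not show how buffered operations are later
      replayed.

```c
#include <assert.h>

#define SKETCH_MAX_PENDING 8   /* hypothetical buffer capacity */

/* An RMA operation carrying a piggybacked LOCK request (hypothetical). */
typedef struct {
    int origin;
    int exclusive;   /* 1 = exclusive lock requested, 0 = shared */
} sketch_op;

typedef struct {
    int exclusive_held;
    int shared_holders;
    int applied;                              /* operations applied so far */
    sketch_op pending[SKETCH_MAX_PENDING];    /* operations waiting for LOCK */
    int npending;
} sketch_target;

/* Try to grant the lock; returns 1 on success, 0 on conflict. */
static int sketch_try_lock(sketch_target *t, int exclusive)
{
    if (t->exclusive_held)
        return 0;
    if (exclusive) {
        if (t->shared_holders > 0)
            return 0;
        t->exclusive_held = 1;
    } else {
        t->shared_holders++;
    }
    return 1;
}

/* Handle an incoming operation with a piggybacked LOCK.
 * Returns 1 if the operation was applied, 0 if it was buffered. */
static int sketch_recv_op(sketch_target *t, const sketch_op *op)
{
    if (sketch_try_lock(t, op->exclusive)) {
        t->applied++;
        return 1;
    }
    assert(t->npending < SKETCH_MAX_PENDING);
    t->pending[t->npending++] = *op;
    return 0;
}
```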
    • c73451c0
      Xin Zhao authored
    • Bug-fix: handle dest==MPI_PROC_NULL in Win_flush/flush_local · e12376fd
      Xin Zhao authored
      No reviewer.
    • Bug-fix: check win_ptr->active_req_cnt in RMA sync calls · e92b7746
      Xin Zhao authored
      No reviewer.
    • Bug-fix: correctly modify win_ptr->accumulated_ops_cnt · 7b1a5e2d
      Xin Zhao authored
      accumulated_ops_cnt tracks the number of RMA operations
      posted between two synchronization calls, so that we can
      decide when to poke the progress engine based on the
      current value of this counter.
      
      Here we initialize it to zero in the BEGINNING synchronization
      calls (Win_fence, Win_start, first Win_lock, Win_lock_all),
      and correctly decrement it in the ENDING synchronization calls
      (Win_fence, Win_complete, Win_unlock, Win_unlock_all,
      Win_flush, Win_flush_local, Win_flush_all, Win_flush_local_all).
      We also use a per-target counter to track the single-target case.
      
      No reviewer.
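      A minimal sketch of this bookkeeping follows. The threshold, the
      number of targets, and every sketch_* name are assumptions made
      for illustration; the commit does not state the actual policy
      for when the progress engine is poked.

```c
#include <assert.h>

#define SKETCH_NTARGETS       4
#define SKETCH_POKE_THRESHOLD 3   /* hypothetical; real policy not shown */

typedef struct {
    int accumulated_ops_cnt;                /* ops posted since last sync */
    int per_target_cnt[SKETCH_NTARGETS];    /* same count, per target     */
} sketch_win;

/* BEGINNING synchronization call: start counting from zero. */
static void sketch_begin_epoch(sketch_win *w)
{
    w->accumulated_ops_cnt = 0;
    for (int i = 0; i < SKETCH_NTARGETS; i++)
        w->per_target_cnt[i] = 0;
}

/* Posting an RMA operation bumps both counters. */
static void sketch_post_op(sketch_win *w, int target)
{
    assert(target >= 0 && target < SKETCH_NTARGETS);
    w->accumulated_ops_cnt++;
    w->per_target_cnt[target]++;
}

/* Decide whether to poke the progress engine from the counter value. */
static int sketch_should_poke(const sketch_win *w)
{
    return w->accumulated_ops_cnt >= SKETCH_POKE_THRESHOLD;
}

/* ENDING synchronization call on one target (flush style): remove that
 * target's operations from the window-wide count. */
static void sketch_flush_target(sketch_win *w, int target)
{
    assert(target >= 0 && target < SKETCH_NTARGETS);
    w->accumulated_ops_cnt -= w->per_target_cnt[target];
    w->per_target_cnt[target] = 0;
}
```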
    • Clean up unused attributes in RMA packet structs. · b155e7e0
      Xin Zhao authored
      No reviewer.
    • Code-refactor: arrange RMA pkt structure. · 389aab16
      Xin Zhao authored
      Arrange RMA packet definition and structures in
      src/mpid/ch3/include/mpidpkt.h in the following
      order:
      
      1. RMA operation packets: PUT, GET, ACC, GACC, CAS, FOP
      2. RMA operation response packets: GET_RESP, GACC_RESP, CAS_RESP, FOP_RESP
      3. RMA control packets: LOCK, UNLOCK, FLUSH, DECR_AT_COUNTER
      4. RMA control response packets: LOCK_ACK, FLUSH_ACK
      
      No reviewer.
    • Code-refactor: arrange RMA sync functions. · a544067b
      Xin Zhao authored
      Arrange RMA sync functions in src/mpid/ch3/src/ch3u_rma_sync.c
      in the following order:
      
      Win_fence
      Win_post
      Win_start
      Win_complete
      Win_wait
      Win_test
      Win_lock
      Win_unlock
      Win_flush
      Win_flush_local
      Win_lock_all
      Win_unlock_all
      Win_flush_all
      Win_flush_local_all
      Win_sync
      
      No reviewer.
  2. 11 Dec, 2014 1 commit
  3. 09 Dec, 2014 2 commits
  4. 08 Dec, 2014 1 commit
  5. 05 Dec, 2014 2 commits
  6. 03 Dec, 2014 2 commits
    • Fix typo in error code man page · 8672503d
      Wesley Bland authored
      No reviewer
    • Fix error class bug in MPI_Error_add_code · 422b06d2
      James Dinan authored
      
      During error code creation, the error class was erroneously modified by
      applying ERROR_DYN_MASK.  The dynamic bit is already set for
      user-defined error classes, so this bug had no effect in any existing
      MPICH test.  However, when a predefined error class was passed during
      error code creation, it would be incorrectly marked as dynamic,
      resulting in an invalid result when the error class of a returned error
      code was queried via MPI_Error_class.
      Signed-off-by: Wesley Bland <wbland@anl.gov>
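      The effect of the bug can be reproduced with a small bit-mask
      sketch. The layout below (class in the low bits, a code index
      above it, one bit marking user-defined classes) and all SKETCH_*
      and sketch_* names are assumptions for illustration only; they
      are not MPICH's actual encoding or functions.

```c
#include <assert.h>

/* Hypothetical bit layout for this sketch; not MPICH's actual values. */
#define SKETCH_CLASS_MASK 0x000000ff  /* low bits: error class           */
#define SKETCH_CODE_SHIFT 8           /* code index sits above the class */
#define SKETCH_DYN_MASK   0x40000000  /* marks a user-defined class      */

/* Buggy variant: forces the dynamic bit onto whatever class is passed. */
static int sketch_add_code_buggy(int errclass, int index)
{
    return (errclass | SKETCH_DYN_MASK) | (index << SKETCH_CODE_SHIFT);
}

/* Fixed variant: keeps the class exactly as the caller passed it. */
static int sketch_add_code_fixed(int errclass, int index)
{
    return errclass | (index << SKETCH_CODE_SHIFT);
}

/* Recover the class, including its dynamic bit, from an error code. */
static int sketch_error_class(int errcode)
{
    return errcode & (SKETCH_CLASS_MASK | SKETCH_DYN_MASK);
}
```

      With a user-defined class the two variants agree, because the
      dynamic bit is already set. With a predefined class only the
      fixed variant returns the class that was passed in.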
  7. 02 Dec, 2014 1 commit
  8. 28 Nov, 2014 3 commits
  9. 26 Nov, 2014 5 commits
  10. 24 Nov, 2014 3 commits
    • romio gpfs: select correct read buffer · 230c2df3
      Paul Coffman authored and Rob Latham committed
      
      ROMIO GPFSMPIO_P2PCONTIG threaded read needs to toggle first read buffer
      
      When both the GPFSMPIO_P2PCONTIG and GPFSMPIO_PTHREADIO
      optimizations were enabled, reads had a correctness bug: in the
      first round, the read buffer did not toggle to the two-phase buffer
      for the pthread reader, so data was disseminated from the wrong
      buffer.  The fix is to do the toggle after the first read.
      Signed-off-by: Paul Coffman <pkcoff@us.ibm.com>
      Signed-off-by: Rob Latham <robl@mcs.anl.gov>
    • Bug-fix: prevent completing the same RMA request twice. · 8a0887b9
      Xin Zhao authored
      
      It is possible for the request handler of an RMA request to be
      called a second time, on the same request, from inside the
      first invocation of that handler.
      
      Consider the following case: a req is queued up in the Nemesis
      SHM queue with a ref count of 2: one for request completion
      and another for dequeueing from the SHM queue. The first
      invocation of the req handler completes this request and
      decrements the ref count to 1. The request is still in the queue.
      However, within this handler, we trigger the same req handler on
      the same request again (for example, by making progress on the
      SHM queue), and the second invocation also tries to complete the
      request, which leads to incorrect execution.
      
      In this patch we check whether the request has already been
      completed when entering the req handler, to prevent processing
      the same request twice. We also move the call to
      finish_op_on_target() (where the same req handler can be
      triggered again) after the request completion routine, so that
      the current request is marked as completed before the same req
      handler is entered a second time.
      
      Fix #2204
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
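      The guard and the reordering described above can be sketched as
      follows. This is a simplified model, not the Nemesis code: the
      request structure, the reenter_budget field that simulates
      progress re-triggering the handler, and all sketch_* names are
      hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct sketch_req {
    int  ref_count;       /* one ref for completion, one for the queue */
    bool completed;       /* set once the request has been completed   */
    int  complete_calls;  /* how many times completion really ran      */
    int  reenter_budget;  /* simulated re-entries left (hypothetical)  */
} sketch_req;

static void sketch_handler(sketch_req *req);

/* Stand-in for finish_op_on_target(): making progress here may invoke
 * the same handler on the same request again. */
static void sketch_finish_op(sketch_req *req)
{
    if (req->reenter_budget > 0) {
        req->reenter_budget--;
        sketch_handler(req);
    }
}

static void sketch_handler(sketch_req *req)
{
    /* Guard: a nested invocation finds the request already completed. */
    if (req->completed)
        return;

    /* Complete the request first ... */
    req->completed = true;
    req->ref_count--;
    req->complete_calls++;

    /* ... and only then run the step that may re-enter this handler. */
    sketch_finish_op(req);
}
```

      Because the request is marked completed before sketch_finish_op()
      runs, the nested call returns immediately and the ref count is
      decremented only once.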
    • Make ROMIO htmldocs update link file · e645371f
      William Gropp authored and Rob Latham committed
      
      Update the use of DOCTEXT to match the rest of MPICH, including adding
      -nolocation (drop the location of the source file from the documentation)
      and ensuring that the mpi.cit file contains the I/O routines as well as
      the others (this file can be used to add links to the man pages in
      other documents).
      Signed-off-by: Rob Latham <robl@mcs.anl.gov>
  11. 21 Nov, 2014 2 commits
  12. 20 Nov, 2014 1 commit
  13. 19 Nov, 2014 1 commit