1. 20 Apr, 2015 3 commits
  2. 04 Mar, 2015 9 commits
    • Rename predefined_type / predef_type to basic_type. · 04deb880
      Xin Zhao authored and Pavan Balaji committed

      In the MPI standard, a predefined datatype is called a basic type.
      It is better for the code to use the same name as the standard.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
    • Rename eltype, n_elements and element_size to better names. · 98c76f78
      Xin Zhao authored and Pavan Balaji committed

      Signed-off-by: Pavan Balaji <balaji@anl.gov>
    • Modify RMA pkt handlers and req handlers to allow for stream units. · efad963a
      Xin Zhao authored and Pavan Balaji committed

      On the target side, we always allocate an SRBuf of 256K, which
      equals the size of a stream unit, to receive ACC/GACC data.

      Note that in MPIDI_CH3U_Request_load_recv_iov(), for ACC/GACC
      operations, we already use an SRBuf to receive the data at the
      beginning, so we do not use another SRBuf there, in order to
      avoid one more memory copy.

      Also, we pass the stream_offset in the current RMA packet to
      the request struct (when receiving is not finished) and to the
      do_accumulate_op function (when receiving is finished).
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
    • Bug-fix: add FOP req types. · c9435750
      Xin Zhao authored and Pavan Balaji committed

      This patch adds req types for the FOP operation, and calls the
      FOP req handler after the SRBuf is unpacked.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
    • Correct the names of RMA request types. · f75eb4eb
      Xin Zhao authored and Pavan Balaji committed

      Signed-off-by: Pavan Balaji <balaji@anl.gov>
    • Code-refactoring: make perform_get_acc_in_lock_queue cleaner. · a3af53c3
      Xin Zhao authored and Pavan Balaji committed

      This patch does not change any functionality; it only makes the
      code structure cleaner.

      The original structure of perform_get_acc_in_lock_queue is
      messy because the code that handles the IMMED packet type and
      the code that handles the normal packet type are mixed together.
      This patch separates the two parts and makes the function look
      cleaner.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
    • Bug-fix: make RMA work correctly with pair basic type. · ce8bc310
      Xin Zhao authored and Pavan Balaji committed

      The original implementation of RMA does not consider pair basic
      types (e.g. MPI_FLOAT_INT, MPI_DOUBLE_INT). It only
      works correctly with builtin datatypes (e.g. MPI_INT, MPI_FLOAT).
      This patch makes RMA work correctly with pair basic types.

      The bug has two parts: (1) when performing the ACC computation, the
      original implementation uses 'eltype' in the datatype structure,
      which is set when all basic elements in the datatype have the same
      builtin datatype. When basic elements have different builtin
      datatypes, as in pair datatypes, 'eltype' is set to
      MPI_DATATYPE_NULL, so the ACC computation cannot work with pair
      types; (2) for all basic-type data, the original implementation
      assumes that the data is contiguous and issues it in an unpacked
      manner with a length equal to the data size (count*type_size). This
      is incorrect for pair datatypes, because most pair datatypes are
      non-contiguous (type_extent != type_size).

      In the previous patch, we already made 'eltype' store the basic
      type instead of the builtin type. In this patch, we fix the
      bug by (1) modifying the ACC computation to treat 'eltype' as a
      basic type; (2) using the noncontig API for non-contiguous
      basic-type data, so that it is issued in a packed manner.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
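As a rough illustration of point (2) in the commit above, the sketch below uses a plain C struct as a stand-in for the memory layout behind MPI_DOUBLE_INT. The names double_int_t, pair_type_size, pair_type_extent and pack_pairs are hypothetical and are not MPICH functions; the actual fix uses MPICH's noncontig API, which is not reproduced here. On many ABIs the struct carries trailing padding, so its extent is larger than its data size and sending count*type_size raw bytes would not cover the data correctly.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the layout behind MPI_DOUBLE_INT. */
typedef struct {
    double val;
    int idx;
} double_int_t;

/* Data size: bytes of payload only (no padding). */
static size_t pair_type_size(void)
{
    return sizeof(double) + sizeof(int);
}

/* Extent: distance in memory between consecutive elements. */
static size_t pair_type_extent(void)
{
    return sizeof(double_int_t);
}

/* Pack 'count' elements into a contiguous buffer, skipping any padding.
 * Returns the number of bytes written to 'dst'. */
static size_t pack_pairs(const double_int_t *src, size_t count,
                         unsigned char *dst)
{
    size_t off = 0;
    for (size_t i = 0; i < count; i++) {
        memcpy(dst + off, &src[i].val, sizeof(double));
        off += sizeof(double);
        memcpy(dst + off, &src[i].idx, sizeof(int));
        off += sizeof(int);
    }
    return off;
}
```

Packing three elements yields 3 * pair_type_size() bytes, whereas the same three elements span 3 * pair_type_extent() bytes in memory.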
    • Simplify code: deleting derived DT code for op piggybacked with LOCK. · 2317b31d
      Xin Zhao authored

      We piggyback the LOCK flag with operations that do not use
      derived datatypes. Therefore, we delete the unnecessary
      code that deals with derived datatypes in the piggyback LOCK code.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
    • Simplify code: not using flag MPIDI_CH3_PKT_FLAG_RMA_IMMED_RESP for GACC/FOP. · 344bf958
      Xin Zhao authored

      The flag MPIDI_CH3_PKT_FLAG_RMA_IMMED_RESP tells the target
      whether the response packet for the current GET, GACC or FOP
      should use the IMMED packet type. We use the IMMED packet type
      only when the origin, target and result datatypes are all basic
      types. Since the target does not know the origin and result
      datatypes, the origin process needs to set a flag to inform the
      target.

      However, this usage is redundant for GACC and FOP packets. When
      we use the IMMED packet type for a GACC/FOP packet, the origin,
      target and result datatypes must all be basic types, so the
      response packet must use the IMMED packet type as well, and
      MPIDI_CH3_PKT_FLAG_RMA_IMMED_RESP and its related code are not
      necessary. In short, the flag is useful only for the GET operation.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
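The reasoning in the commit above can be sketched as a small decision function on the target side. This is only an illustration under stated assumptions: rma_op_t, FLAG_IMMED_RESP and use_immed_response are hypothetical names (FLAG_IMMED_RESP stands in for MPIDI_CH3_PKT_FLAG_RMA_IMMED_RESP), and the sketch assumes that a GACC/FOP response simply mirrors the packet type of its request.

```c
#include <stdbool.h>

/* Hypothetical operation kinds that produce a response packet. */
typedef enum { OP_GET, OP_GACC, OP_FOP } rma_op_t;

/* Hypothetical stand-in for MPIDI_CH3_PKT_FLAG_RMA_IMMED_RESP. */
#define FLAG_IMMED_RESP 0x1u

/* Decide on the target whether the response should be an IMMED packet.
 * For GET the target cannot infer this from the request, so it reads the
 * flag set by the origin. For GACC/FOP (assumption of this sketch) an
 * IMMED request already implies an IMMED response, so no flag is needed. */
static bool use_immed_response(rma_op_t op, bool request_is_immed,
                               unsigned flags)
{
    if (op == OP_GET)
        return (flags & FLAG_IMMED_RESP) != 0;
    return request_is_immed;
}
```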
  3. 03 Mar, 2015 1 commit
  4. 13 Feb, 2015 7 commits
    • Remove source_win_handle from GET-like RMA packets. · 80a71e11
      Xin Zhao authored

      For GET-like RMA packets and response packets (GACC,
      GET, FOP, CAS, GACC_RESP, GET_RESP, FOP_RESP, CAS_RESP),
      we originally carried source_win_handle in the packet struct
      in order to locate the window handle on the origin side in the
      packet handler of response packets. However, this is
      not necessary because source_win_handle can be stored
      in the request on the origin side. This patch deletes
      source_win_handle from those packets to reduce the size
      of the packet union.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
    • Bug-fix: use do_accumulate_op function for ACC computation. · c8ecef8d
      Xin Zhao authored

      do_accumulate_op() does more comprehensive work on the ACC
      computation than the OP function. For example, MPI_REPLACE
      is not defined as a predefined computation and is therefore
      not handled by the OP function, but it is safely handled
      in do_accumulate_op(). This patch replaces the OP function
      with do_accumulate_op() on the target side.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
    • Change argument of function finish_op_on_target. · 1b30ab19
      Xin Zhao authored

      In this patch, we replace one argument of function
      finish_op_on_target, "packet(op) type", with "has_response_data".
      finish_op_on_target does not care which specific
      packet(op) type it is processing; it only cares
      whether the current op has response data (like GET/GACC).
      Changing the argument in this way simplifies the
      code by avoiding the need to acquire the packet(op) type every
      time before calling finish_op_on_target.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
    • Add asserts for RMA packet types. · 21479b00
      Xin Zhao authored

      Signed-off-by: Pavan Balaji <balaji@anl.gov>
    • Rewrite code of piggybacking IMMED data with RMA packets. · de9d0f21
      Xin Zhao authored

      Originally we added "immed_data" and "immed_len" areas to RMA packets
      in order to piggyback a small amount of data with the packet header
      and reduce the number of packets (note that "immed_len" is necessary
      when the piggybacked data is not the entire data). However, those
      areas potentially increase the packet union size and degrade
      two-sided communication. This patch fixes this issue.

      In this patch, we remove "immed_data" and "immed_len" from the normal
      "MPIDI_CH3_Pkt_XXX_t" operation type (e.g. MPIDI_CH3_Pkt_put_t), and
      we introduce a new "MPIDI_CH3_Pkt_XXX_immed_t" packet type for each
      operation (e.g. MPIDI_CH3_Pkt_put_immed_t).

      "MPIDI_CH3_Pkt_XXX_immed_t" is used when (1) both origin and target
      are basic datatypes, AND (2) the data to be sent fits entirely
      into the header. As a result, "MPIDI_CH3_Pkt_XXX_immed_t" needs the
      "immed_data" area but can drop the "immed_len" area. Also, since it
      only works with a basic target datatype, it can drop the
      "dataloop_size" area as well. All operations that do not satisfy
      (1) or (2) use the normal "MPIDI_CH3_Pkt_XXX_t" type.

      Originally we always piggybacked FOP data into the packet header,
      which made the packet size too large. In this patch we split the
      FOP operation into IMMED packets and normal packets.

      Because CAS only works with 2 basic datatypes and non-complex
      elements, the amount of data is relatively small, so we always
      piggyback the data with the packet header and only use the
      "MPIDI_CH3_Pkt_XXX_immed_t" packet type for CAS.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
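Conditions (1) and (2) in the commit above amount to a simple predicate on the origin side. The sketch below is a minimal illustration, not the MPICH code: can_use_immed_pkt is a hypothetical name, and IMMED_CAPACITY is an assumed header payload capacity chosen only for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed number of payload bytes that fit in an IMMED packet header.
 * The real capacity depends on the packet layout and is not given here. */
#define IMMED_CAPACITY 16u

/* Returns true when the operation may use the XXX_immed_t packet type:
 * (1) origin and target datatypes are both basic, AND
 * (2) the whole payload fits in the header.
 * Otherwise the normal XXX_t packet type is used. */
static bool can_use_immed_pkt(bool origin_is_basic, bool target_is_basic,
                              size_t data_len)
{
    return origin_is_basic && target_is_basic && data_len <= IMMED_CAPACITY;
}
```

Because the whole payload always fits when the predicate holds, an IMMED packet in this scheme has no need for a separate length field for partially piggybacked data.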
    • Remove lock_type and origin_rank areas from RMA packet. · 81e2b274
      Xin Zhao authored

      Originally we added lock_type and origin_rank areas
      to the RMA packet in order to piggyback a passive lock request
      with RMA operations. However, those areas potentially
      enlarge the packet union size, and they are actually
      not necessary and can be completely avoided.

      "Lock_type" is used to remember which type of lock (shared or
      exclusive) the origin wants to acquire on the target. To remove
      it from the RMA packet, we use flags (which already exist in the
      RMA packet) to remember this information.

      "Origin_rank" is used to remember which origin has sent a lock
      request to the target, so that when the lock is later granted to
      this origin, the target can send an ack to it. In fact, the target
      does not need to store origin_rank; it only needs to store
      origin_vc, which is known from the progress engine on the target
      side. Therefore, we can completely remove origin_rank from the
      RMA packet.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
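Carrying the lock type in existing flag bits, as the commit above describes, might look roughly like the following. All names here (lock_type_t, FLAG_LOCK_SHARED, FLAG_LOCK_EXCLUSIVE, FLAG_OTHER, set_lock_flag, lock_type_from_flags) and the bit positions are hypothetical and chosen for illustration only; they are not taken from the MPICH sources.

```c
#include <stdint.h>

/* Hypothetical lock kinds requested by the origin. */
typedef enum { LOCK_TYPE_NONE, LOCK_TYPE_SHARED, LOCK_TYPE_EXCLUSIVE } lock_type_t;

/* Hypothetical flag bits inside an existing per-packet flags word. */
#define FLAG_LOCK_SHARED    (1u << 0)
#define FLAG_LOCK_EXCLUSIVE (1u << 1)
#define FLAG_OTHER          (1u << 2)  /* some unrelated, pre-existing flag */

/* Origin side: record the requested lock type in the flags word. */
static uint32_t set_lock_flag(uint32_t flags, lock_type_t type)
{
    flags &= ~(FLAG_LOCK_SHARED | FLAG_LOCK_EXCLUSIVE);
    if (type == LOCK_TYPE_SHARED)
        flags |= FLAG_LOCK_SHARED;
    else if (type == LOCK_TYPE_EXCLUSIVE)
        flags |= FLAG_LOCK_EXCLUSIVE;
    return flags;
}

/* Target side: recover the lock type without a dedicated packet field. */
static lock_type_t lock_type_from_flags(uint32_t flags)
{
    if (flags & FLAG_LOCK_EXCLUSIVE)
        return LOCK_TYPE_EXCLUSIVE;
    if (flags & FLAG_LOCK_SHARED)
        return LOCK_TYPE_SHARED;
    return LOCK_TYPE_NONE;
}
```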
  5. 08 Feb, 2015 1 commit
    • Bug-fix: guarantee atomicity for FOP and GACC. · bad898f9
      Xin Zhao authored

      FOP, CAS and GACC are atomic "read-modify-write" operations,
      which means that when the target window is defined on a SHM region,
      we need an inter-process lock to guarantee the atomicity of the
      entire "read+OP". The current implementation is correct for
      SHM-based RMA operations, but not for AM-based RMA
      operations: for SHM-based operations it protects the entire
      "read+OP", but for AM-based operations it only protects the
      "OP" part.

      This patch fixes this issue by protecting both the memory copy to
      the temporary buffer and the computation for AM-based operations.

      Fix ticket 2226
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
  6. 16 Dec, 2014 14 commits
  7. 24 Nov, 2014 1 commit
    • Bug-fix: preventing completing the same RMA request twice. · 8a0887b9
      Xin Zhao authored

      A request handler of an RMA request can be called a second
      time, on the same request, from inside the first call of
      that handler.

      Consider the following case: a req is queued in the Nemesis
      SHM queue with a ref count of 2: one reference for request
      completion and one for dequeueing from the SHM queue. The first
      call of the req handler completes the request and decrements the
      ref count to 1. The request is still in the queue. However,
      within this handler we trigger the same req handler on the
      same request again (for example, by making progress on the SHM
      queue), and the second call also tries to complete the
      request, which leads to incorrect execution.

      In this patch we check whether the request has already been
      completed when entering the req handler, to prevent processing
      the same request twice. We also move the call to
      finish_op_on_target() (where the same req handler can be
      triggered again) after the request completion routine, so that
      the current request is marked as completed before the same req
      handler is entered a second time.

      Fix #2204
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
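The re-entrancy guard described in the commit above can be sketched in isolation as follows. This is a simplified model, not the MPICH handler: request_t, req_handler and the body of finish_op_on_target are hypothetical, and the recursive call merely simulates progress on the SHM queue triggering the same handler again.

```c
#include <stdbool.h>

/* Hypothetical, simplified request object. */
typedef struct {
    int ref_count;     /* 2 = one ref for completion, one for the queue */
    bool completed;    /* set once the request has been completed */
    int ops_finished;  /* counts how often the op was finished */
} request_t;

static void req_handler(request_t *req);

/* Simulated finish routine: finishing the op makes progress, which in
 * this model re-triggers the same handler on the same request. */
static void finish_op_on_target(request_t *req)
{
    req->ops_finished++;
    req_handler(req);
}

static void req_handler(request_t *req)
{
    /* Guard: a second entry on an already completed request is a no-op. */
    if (req->completed)
        return;

    /* Complete the request first ... */
    req->completed = true;
    req->ref_count--;

    /* ... and only then call the routine that may re-enter the handler. */
    finish_op_on_target(req);
}
```

With the guard and this ordering, one call to req_handler on a request with ref_count 2 leaves ref_count at 1 and finishes the op exactly once, even though the handler is entered twice.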
  8. 13 Nov, 2014 3 commits
  9. 03 Nov, 2014 1 commit
    • Delete no longer needed code. · cc63b367
      Xin Zhao authored

      We made a huge change to the RMA infrastructure, and
      a lot of old code can be dropped, including separate
      handlers for lock-op-unlock, ACCUM_IMMED-specific
      code, O(p) data structure code, code for lazy issuing,
      etc.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>