1. 30 May, 2015 2 commits
    • Add extended packet header in CH3 layer used by RMA messages · 25e40e43
      Xin Zhao authored

      This commit adds an extended packet header in the CH3 layer. It
      transmits attributes that are needed only in RMA and not in
      two-sided communication. The key implementation details are as
      follows:

      Origin side:

      (1) The extended packet header is stored in the request, and
      the request is passed to the issuing function (iSendv() or
      sendNoncontig_fn()) in the lower layer. The issuing function
      checks whether the extended packet header exists in the request
      and, if so, issues that header. (The modifications in the lower
      layer are in the next commit.)

      (2) There is a fast path used when (origin data is contiguous &&
      target data is predefined && extended packet header is not used).
      In that case, we do not need to create a request beforehand
      but can use the iStartMsgv() issuing function, which tries to
      issue the entire message as soon as possible.

      Target side:

      (1) Two request handlers are used when the extended packet header
      is used or the target datatype is derived. The first request
      handler is triggered when the extended packet header / target
      datatype info arrives, and the second is triggered when the
      actual data arrives.

      (2) When the target side receives a stream unit that is piggybacked
      with LOCK, it drops the stream_offset in the extended packet header,
      since the stream unit must be the first one and stream_offset must be 0.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
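The origin-side fast-path condition in the commit message above can be sketched as a small predicate. This is only an illustration under stated assumptions: rma_op_desc, issue_path and choose_issue_path are hypothetical names, not MPICH identifiers, and the real code decides this inside the CH3 issuing routines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one RMA operation on the origin side. */
typedef struct {
    bool origin_is_contig;      /* origin buffer is contiguous */
    bool target_is_predefined;  /* target datatype is predefined */
    size_t ext_hdr_size;        /* 0 means no extended packet header */
} rma_op_desc;

typedef enum {
    ISSUE_FAST_PATH,     /* no request created beforehand (iStartMsgv-style) */
    ISSUE_WITH_REQUEST   /* request created first; it may carry the extended header */
} issue_path;

/* Fast path only when all three conditions from the commit message hold. */
static issue_path choose_issue_path(const rma_op_desc *op)
{
    assert(op != NULL);
    if (op->origin_is_contig && op->target_is_predefined && op->ext_hdr_size == 0)
        return ISSUE_FAST_PATH;
    return ISSUE_WITH_REQUEST;
}
```

An operation that needs an extended header always takes the request-based path in this sketch, even when its data is contiguous and the target datatype is predefined.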
    • Revert "Move 'stream_offset' out of RMA packet struct." · 6f62c424
      Xin Zhao authored
      This reverts commit 19f29078.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
  2. 20 Apr, 2015 2 commits
    • Set size of IMMED data in RMA packets to 8 bytes. · de0412c2
      Xin Zhao authored

      Originally the size of IMMED data in RMA packets was 16 bytes,
      which made the size of a CH3 packet 56 bytes. Here we reduce
      the size of IMMED data in RMA packets to 8 bytes, so that the
      size of a CH3 packet is reduced to 48 bytes, the same as in
      mpich-3.1.4 (the old RMA infrastructure).
      Signed-off-by: Min Si <msi@il.is.s.u-tokyo.ac.jp>
      Signed-off-by: Antonio J. Pena <apenya@mcs.anl.gov>
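The size arithmetic above (56 bytes with a 16-byte IMMED area, 48 bytes with an 8-byte one) can be reproduced with a toy layout. The 40-byte fixed part below is an assumption made up for this sketch; its field names and widths are not the real MPICH packet definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed part of an RMA packet: 40 bytes, no padding needed. */
typedef struct {
    uint32_t type;
    uint32_t flags;
    uint64_t addr;
    uint32_t count;
    uint32_t datatype;
    uint32_t target_win_handle;
    uint32_t source_win_handle;
    uint64_t request_handle;
} pkt_fixed_part;

/* Same fixed part followed by a 16-byte or an 8-byte IMMED area. */
typedef struct { pkt_fixed_part fixed; unsigned char immed_data[16]; } pkt_immed16;
typedef struct { pkt_fixed_part fixed; unsigned char immed_data[8];  } pkt_immed8;

/* Bytes saved per packet by shrinking the IMMED area. */
static size_t immed_bytes_saved(void)
{
    assert(sizeof(pkt_immed16) >= sizeof(pkt_immed8));
    return sizeof(pkt_immed16) - sizeof(pkt_immed8);
}
```

With this layout, sizeof(pkt_immed16) is 56 and sizeof(pkt_immed8) is 48 on common ABIs, matching the numbers in the commit message.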
    • Move 'stream_offset' out of RMA packet struct. · 19f29078
      Xin Zhao authored

      'stream_offset' specifies the starting position (on the target
      window) of the current streaming unit in ACC-like operations.
      It was originally placed in the RMA packet struct, which
      potentially increases the CH3 packet size.

      In this patch, we move 'stream_offset' out of the RMA
      packet as follows:

      1. When the target data is a basic datatype, we use
      'stream_offset' and the starting address of the entire
      operation to calculate the starting address of the current
      streaming unit, and rewrite 'addr' in the RMA packet with that
      value.

      2. When the target data is a derived datatype, we cannot do
      the same thing because the target needs to know both the
      starting address of the entire operation and the starting
      address of the current streaming unit. Therefore, we send
      'stream_offset' separately to the target side.
      Signed-off-by: Min Si <msi@il.is.s.u-tokyo.ac.jp>
      Signed-off-by: Antonio J. Pena <apenya@mcs.anl.gov>
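For the basic-datatype case above, folding the offset into the packet address might look like the following sketch. It assumes stream_offset is a byte offset; acc_pkt and set_stream_unit_addr are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down packet: only the field this sketch needs. */
typedef struct {
    uintptr_t addr;   /* target address carried in the packet */
} acc_pkt;

/* Basic target datatype: rewrite 'addr' with the start of the current
 * streaming unit, so no separate stream_offset field is required.
 * Assumes stream_offset is measured in bytes. */
static void set_stream_unit_addr(acc_pkt *pkt, uintptr_t op_base_addr, size_t stream_offset)
{
    assert(pkt != NULL);
    pkt->addr = op_base_addr + stream_offset;
}
```

For a derived target datatype this rewrite would lose the base address of the whole operation, which is why the commit message sends stream_offset separately in that case.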
  3. 04 Mar, 2015 2 commits
  4. 03 Mar, 2015 1 commit
  5. 13 Feb, 2015 9 commits
    • Delete comments that no longer make sense. · 21126e9e
      Xin Zhao authored

      The comments are no longer relevant to the
      new RMA infrastructure.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
    • Remove source_win_handle from GET-like RMA packets. · 80a71e11
      Xin Zhao authored

      For GET-like RMA packets and response packets (GACC,
      GET, FOP, CAS, GACC_RESP, GET_RESP, FOP_RESP, CAS_RESP),
      we originally carried source_win_handle in the packet struct
      in order to locate the window handle on the origin side in the
      packet handler of response packets. However, this is
      not necessary because source_win_handle can be stored
      in the request on the origin side. This patch deletes
      source_win_handle from those packets to reduce the size
      of the packet union.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
    • Rewrite code of piggybacking IMMED data with RMA packets. · de9d0f21
      Xin Zhao authored

      Originally we added "immed_data" and "immed_len" areas to RMA packets
      in order to piggyback a small amount of data with the packet header
      and reduce the number of packets. (Note that "immed_len" is necessary
      when the piggybacked data is not the entire data.) However, those areas
      potentially increase the packet union size and degrade two-sided
      communication. This patch fixes this issue.

      In this patch, we remove "immed_data" and "immed_len" from the normal
      "MPIDI_CH3_Pkt_XXX_t" operation type (e.g. MPIDI_CH3_Pkt_put_t), and
      we introduce a new "MPIDI_CH3_Pkt_XXX_immed_t" packet type for each
      operation (e.g. MPIDI_CH3_Pkt_put_immed_t).

      "MPIDI_CH3_Pkt_XXX_immed_t" is used when (1) both origin and target
      are basic datatypes, AND (2) the data to be sent fits entirely
      into the header. As a result, "MPIDI_CH3_Pkt_XXX_immed_t" needs the
      "immed_data" area but can drop the "immed_len" area. Also, since it
      only works with a basic target datatype, it can drop the
      "dataloop_size" area as well. All operations that do not satisfy
      both (1) and (2) use the normal "MPIDI_CH3_Pkt_XXX_t" type.

      Originally we always piggybacked FOP data into the packet header,
      which made the packet size too large. In this patch we split the
      FOP operation into IMMED packets and normal packets.

      Because CAS only works with two elements of a basic, non-complex
      datatype, the amount of data is relatively small, so we always
      piggyback the data with the packet header and only use the
      "MPIDI_CH3_Pkt_XXX_immed_t" packet type for CAS.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
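The rule for choosing between the IMMED and the normal packet type can be sketched as below. IMMED_CAPACITY, op_info, pkt_kind and choose_pkt_kind are hypothetical names, and the 16-byte capacity is an assumed value for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define IMMED_CAPACITY 16   /* assumed size of the "immed_data" area */

/* Hypothetical summary of one operation. */
typedef struct {
    bool origin_is_basic;   /* origin datatype is basic */
    bool target_is_basic;   /* target datatype is basic */
    size_t data_len;        /* bytes of data to send */
} op_info;

typedef enum { PKT_NORMAL, PKT_IMMED } pkt_kind;

/* IMMED packet only if conditions (1) and (2) both hold. */
static pkt_kind choose_pkt_kind(const op_info *op)
{
    assert(op != NULL);
    if (op->origin_is_basic && op->target_is_basic && op->data_len <= IMMED_CAPACITY)
        return PKT_IMMED;
    return PKT_NORMAL;
}
```

Because an IMMED packet always carries the entire data, a length field such as "immed_len" is not needed on that path in this sketch.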
    • Remove lock_type and origin_rank areas from RMA packet. · 81e2b274
      Xin Zhao authored

      Originally we added lock_type and origin_rank areas
      to the RMA packet in order to piggyback passive lock requests
      with RMA operations. However, those areas potentially
      enlarge the packet union size, and they are actually
      unnecessary and can be avoided completely.

      "lock_type" records which type of lock (shared or
      exclusive) the origin wants to acquire on the target. To remove
      it from the RMA packet, we use the flags field (which already
      exists in the RMA packet) to record that information.

      "origin_rank" records which origin has sent a lock
      request to the target, so that when the lock is granted to this
      origin later, the target can send an ack to that origin. The
      target does not actually need to store origin_rank; it only needs
      to store origin_vc, which is known from the progress engine on
      the target side. Therefore, we can completely remove origin_rank
      from the RMA packet.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
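Encoding the lock type in the existing flags field could look like this sketch. The flag bit values and all names here are hypothetical; the real MPIDI_CH3_PKT_FLAG_* constants may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits for the two lock kinds. */
#define FLAG_RMA_LOCK_SHARED    (1u << 0)
#define FLAG_RMA_LOCK_EXCLUSIVE (1u << 1)

typedef enum { LOCK_NONE, LOCK_SHARED, LOCK_EXCLUSIVE } lock_type;

/* Store the requested lock type in 'flags', leaving other bits untouched. */
static uint32_t set_lock_flag(uint32_t flags, lock_type t)
{
    flags &= ~(FLAG_RMA_LOCK_SHARED | FLAG_RMA_LOCK_EXCLUSIVE);
    if (t == LOCK_SHARED)
        flags |= FLAG_RMA_LOCK_SHARED;
    else if (t == LOCK_EXCLUSIVE)
        flags |= FLAG_RMA_LOCK_EXCLUSIVE;
    return flags;
}

/* Recover the lock type on the target side. */
static lock_type get_lock_type(uint32_t flags)
{
    /* A packet never requests both lock kinds at once. */
    assert(!((flags & FLAG_RMA_LOCK_SHARED) && (flags & FLAG_RMA_LOCK_EXCLUSIVE)));
    if (flags & FLAG_RMA_LOCK_SHARED)
        return LOCK_SHARED;
    if (flags & FLAG_RMA_LOCK_EXCLUSIVE)
        return LOCK_EXCLUSIVE;
    return LOCK_NONE;
}
```

Two spare bits in a field the packet already carries replace a dedicated lock_type member, so the packet union does not grow.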
    • Add comments about RMA packet wrappers. · d46b848a
      Xin Zhao authored

      Signed-off-by: Pavan Balaji <balaji@anl.gov>
    • Modify packet wrappers to make them complete. · 064e60ce
      Xin Zhao authored

      Some packet wrappers did not include all packet types;
      this patch adds the missing packet types to those wrappers.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
    • Re-apply modifications on mpidpkt.h. · fa958833
      Xin Zhao authored
      This patch re-applies the modifications to mpidpkt.h that were
      temporarily reverted in bb3f9623.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
    • Revert "Code-refactor: arrange RMA pkt structure." · 2cbc9180
      Xin Zhao authored
      This reverts commit 389aab16.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
    • Temporarily revert commits for src/mpid/ch3/include/mpidpkt.h · bb3f9623
      Xin Zhao authored
      We are going to revert commit 389aab16 because it re-ordered
      the attributes in the RMA packet structs in mpidpkt.h and messed up
      the alignment.

      This commit temporarily reverts the following commits, but
      only their modifications to mpidpkt.h made after commit 389aab16:

      e36203c3, 45afd1fd, 3a05784f, 87acbbbe, b155e7e0

      We will re-apply those modifications after we revert 389aab16.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
  6. 16 Dec, 2014 12 commits
    • Support handling different LOCK ACKs · 45afd1fd
      Xin Zhao authored
      No reviewer.
    • Delete MPIDI_CH3_PKT_FLAG_RMA_UNLOCK_ACK flag · 97022653
      Xin Zhao authored
      The behavior of the UNLOCK_ACK flag is exactly the same
      as that of FLUSH_ACK, so here we delete the
      UNLOCK_ACK flag and use the FLUSH_ACK flag for all FLUSH
      ACK packets.

      No reviewer.
    • Modify ACK of op with both LOCK and UNLOCK (FLUSH) flags · 03ebc97b
      Xin Zhao authored
      No reviewer.
    • Modify send_lock_ack_pkt function to contain flags. · 01679120
      Xin Zhao authored
      No reviewer.
    • Add new pkt flags for different LOCK ACKs. · faae55ad
      Xin Zhao authored
      Add new flags for four different kinds of LOCK ACKs:

      (1) LOCK_GRANTED: the lock is granted on the target.
      (2) LOCK_QUEUED_DATA_QUEUED: the lock is not granted on the target,
          but it is safely queued there. If this lock request
          was sent with an RMA operation, the operation data is also
          safely queued on the target.
      (3) LOCK_QUEUED_DATA_DISCARDED: the lock is not granted on the target,
          but it is safely queued there. If this lock request
          was sent with an RMA operation, the operation data is discarded
          on the target because the target is out of resources.
      (4) LOCK_DISCARDED: the lock is not granted on the target, and it is
          not queued on the target because the target is out of resources.
          If this lock request was sent with an RMA operation, the
          operation data is also discarded on the target.

      No reviewer.
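The four LOCK ACK kinds listed above can be read as a small decision function on the target. The enumerator names follow the commit message, but lock_outcome and classify_lock_ack are hypothetical helpers written for this sketch, not MPICH code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum {
    LOCK_GRANTED,
    LOCK_QUEUED_DATA_QUEUED,
    LOCK_QUEUED_DATA_DISCARDED,
    LOCK_DISCARDED
} lock_ack_kind;

/* Hypothetical summary of what the target managed to do with a request. */
typedef struct {
    bool granted;       /* lock acquired immediately */
    bool lock_queued;   /* lock request stored in the lock queue */
    bool data_queued;   /* piggybacked operation data stored as well
                           (also true when there was no data to store) */
} lock_outcome;

/* Map the outcome to the ACK kind the target would report. */
static lock_ack_kind classify_lock_ack(const lock_outcome *o)
{
    assert(o != NULL);
    if (o->granted)
        return LOCK_GRANTED;
    if (!o->lock_queued)
        return LOCK_DISCARDED;
    return o->data_queued ? LOCK_QUEUED_DATA_QUEUED : LOCK_QUEUED_DATA_DISCARDED;
}
```

The distinction matters to the origin: after LOCK_QUEUED_DATA_DISCARDED it presumably has to resend only the data, whereas after LOCK_DISCARDED it has to resend the lock request as well.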
    • Change routine/pkt name from LOCK_GRANTED to LOCK_ACK · e36203c3
      Xin Zhao authored
      Because we will send different kinds of LOCK ACKs (not
      just LOCK_GRANTED, but also LOCK_DISCARDED, for example),
      naming the related packets and functions "LOCK_GRANTED"
      is no longer appropriate. Here we rename them to "LOCK_ACK".

      No reviewer.
    • Bug-fix: add pkt type LOCK in GET_TARGET_WIN_HANDLE macro · 385f0aae
      Xin Zhao authored
      No reviewer.
    • Use int instead of size_t in RMA pkt header. · 3a05784f
      Xin Zhao authored
      Use int instead of size_t in RMA pkt header to reduce
      packet size.

      No reviewer.
    • Bug-fix: add IMMED area in GET/GACC response packets · 87acbbbe
      Xin Zhao authored
      In this patch we allow GET/GACC response packets to
      piggyback some IMMED data, just as we did
      for PUT/GACC/FOP/CAS packets.

      No reviewer.
    • Perf-optimize: support piggybacking LOCK on large RMA operations. · 4739df59
      Xin Zhao authored
      Originally we only allowed a LOCK request to be piggybacked
      on small RMA operations (where all data fits in the packet
      header). This caused communication overhead for larger
      operations, since the origin side had to wait for the LOCK
      ACK before it could transmit data to the target.

      In this patch we add support for piggybacking LOCK on
      RMA operations of arbitrary size. Note that (1) this
      only works with basic datatypes; (2) if the LOCK cannot
      be satisfied, we temporarily buffer the operation on
      the target side.

      No reviewer.
    • Clean up unused attributes in RMA packet structs. · b155e7e0
      Xin Zhao authored
      No reviewer.
    • Code-refactor: arrange RMA pkt structure. · 389aab16
      Xin Zhao authored
      Arrange RMA packet definition and structures in
      src/mpid/ch3/include/mpidpkt.h in the following
      order:

      1. RMA operation packets: PUT, GET, ACC, GACC, CAS, FOP
      2. RMA operation response packets: GET_RESP, GACC_RESP, CAS_RESP, FOP_RESP
      3. RMA control packets: LOCK, UNLOCK, FLUSH, DECR_AT_COUNTER
      4. RMA control response packets: LOCK_ACK, FLUSH_ACK

      No reviewer.
  7. 13 Nov, 2014 1 commit
    • Perf-tuning: issue FLUSH, FLUSH ACK, UNLOCK ACK messages only when needed. · a9d968cc
      Xin Zhao authored

      When the operation pending list and request lists are all empty, the
      origin needs to send a FLUSH message only if it has issued PUT/ACC
      operations since the last synchronization call; otherwise the origin
      does not need to issue FLUSH at all and does not need to wait for a
      FLUSH ACK message.

      Similarly, the origin waits for the ACK of an UNLOCK message only if it
      has issued PUT/ACC operations since the last synchronization call.
      However, the UNLOCK message always needs to be sent because the origin
      needs to unlock the target process. This patch avoids issuing
      unnecessary FLUSH / FLUSH ACK / UNLOCK ACK messages.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
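The two rules above can be written as a pair of small planning functions. The sketch covers only the case the commit message describes, where the pending list and request lists are empty; origin_state, sync_plan, plan_flush and plan_unlock are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical origin-side state at a synchronization call. */
typedef struct {
    bool lists_empty;      /* operation pending list and request lists are all empty */
    bool put_acc_issued;   /* PUT/ACC issued since the last synchronization call */
} origin_state;

typedef struct {
    bool send_msg;       /* send the FLUSH or UNLOCK message */
    bool wait_for_ack;   /* wait for the matching ACK */
} sync_plan;

/* FLUSH: send and wait only if PUT/ACC was issued. */
static sync_plan plan_flush(const origin_state *s)
{
    sync_plan p;
    assert(s != NULL && s->lists_empty);
    p.send_msg = s->put_acc_issued;
    p.wait_for_ack = s->put_acc_issued;
    return p;
}

/* UNLOCK: always sent; the ACK is awaited only if PUT/ACC was issued. */
static sync_plan plan_unlock(const origin_state *s)
{
    sync_plan p;
    assert(s != NULL && s->lists_empty);
    p.send_msg = true;
    p.wait_for_ack = s->put_acc_issued;
    return p;
}
```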
  8. 03 Nov, 2014 11 commits
    • Delete no longer needed code. · cc63b367
      Xin Zhao authored

      We made a huge change to the RMA infrastructure, and
      a lot of old code can be dropped, including separate
      handlers for lock-op-unlock, ACCUM_IMMED-specific
      code, O(p) data structure code, lazy-issuing code,
      etc.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
    • Rewrite code of passive lock control messages. · 0542e304
      Xin Zhao authored

      1. Piggyback the LOCK request with the first IMMED operation.

      When we see an IMMED operation, we can always piggyback the
      LOCK request with that operation, which saves the separate sync
      message of a standalone LOCK request. When the packet header of
      that operation is received on the target, we try to
      acquire the lock and perform the operation. The target
      either piggybacks the LOCK_GRANTED message with the response
      packet (if available) or sends a separate LOCK_GRANTED
      message back to the origin.

      2. Rewrite the code that manages the lock queue.

      When the lock request cannot be satisfied on the target,
      we need to buffer that lock request on the target. All we
      need to do is enqueue the packet header, which contains
      all the information we need after the lock is granted. When
      the current lock is released, the runtime goes
      over the lock queue and grants the lock to the next
      available request. After the lock is granted, the runtime
      just triggers the packet handler a second time.

      3. Release the lock on the target side if UNLOCK is piggybacked.

      If there are active-message operations to be issued,
      we piggyback an UNLOCK flag with the last operation.
      When the target receives it, it releases the current
      lock and grants the lock to the next process.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
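The lock-queue behavior in item 2 above can be illustrated with a very reduced model. The sketch assumes exclusive locks only and a fixed-size FIFO; pkt_hdr, target_win, handle_lock_pkt and release_lock are hypothetical names and do not reflect the actual MPICH data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define LOCK_QUEUE_CAP 8   /* arbitrary capacity for the sketch */

/* Hypothetical packet header; only the origin id matters here. */
typedef struct { int origin; } pkt_hdr;

typedef struct {
    bool locked;
    int owner;                       /* valid only while locked */
    pkt_hdr queue[LOCK_QUEUE_CAP];   /* buffered headers in FIFO order */
    size_t head, count;
} target_win;

/* Packet handler: acquire the lock, or buffer the header if it is taken.
 * Returns true if the lock was granted to hdr->origin. */
static bool handle_lock_pkt(target_win *w, const pkt_hdr *hdr)
{
    assert(w != NULL && hdr != NULL);
    if (!w->locked) {
        w->locked = true;
        w->owner = hdr->origin;
        return true;
    }
    assert(w->count < LOCK_QUEUE_CAP);   /* overflow is not handled here */
    w->queue[(w->head + w->count) % LOCK_QUEUE_CAP] = *hdr;
    w->count++;
    return false;
}

/* Release the lock, then trigger the handler a second time on the next
 * buffered header, if there is one. */
static void release_lock(target_win *w)
{
    assert(w != NULL && w->locked);
    w->locked = false;
    if (w->count > 0) {
        pkt_hdr next = w->queue[w->head];
        w->head = (w->head + 1) % LOCK_QUEUE_CAP;
        w->count--;
        handle_lock_pkt(w, &next);
    }
}
```

Buffering the whole header is what lets the second handler invocation proceed exactly as if the packet had just arrived.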
    • Reset the start of the enum to 0. · 7fbe72dd
      Xin Zhao authored

      We must make the initial value of the enum zero because some places
      determine the number of packet types from the value of the ending type.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
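Why the enum has to start at zero can be seen in a short sketch: when the first enumerator is 0, the trailing sentinel equals the number of packet types and can size a per-type table. The enumerator names and the counting helper below are hypothetical, shortened stand-ins for the real packet type list.

```c
#include <assert.h>

/* Hypothetical, shortened list of packet types. */
typedef enum {
    PKT_EAGER_SEND = 0,   /* first enumerator pinned to 0 */
    PKT_PUT,
    PKT_GET,
    PKT_END_ALL           /* sentinel: equals the number of types above */
} pkt_type;

/* One counter per packet type, sized by the sentinel. */
static unsigned long pkt_counts[PKT_END_ALL];

/* Count one received packet of type t. */
static void count_pkt(pkt_type t)
{
    assert((int) t >= 0 && (int) t < (int) PKT_END_ALL);
    pkt_counts[t]++;
}
```

If the first enumerator started at 1 instead, PKT_END_ALL would be 4 while only three types exist, and the last valid index of a table sized by the type count would be off by one.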
    • Rearrange enum of pkt types. · be3e5bdd
      Xin Zhao authored

      Rearrange the ordering of packet types so that all RMA issuing types
      are placed together. This is convenient when we check whether the
      packets currently involved are all RMA packets.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
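Keeping the RMA issuing types contiguous turns the "is this an RMA packet" check into a range comparison, as the sketch below shows. The enumerator list, the PKT_RMA_FIRST / PKT_RMA_LAST bounds and both helpers are hypothetical and only illustrate the idea.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical packet types with the RMA issuing types grouped together. */
typedef enum {
    PKT_EAGER_SEND = 0,
    PKT_RNDV_REQ,
    PKT_PUT,            /* first RMA issuing type */
    PKT_GET,
    PKT_ACCUMULATE,
    PKT_GET_ACCUM,
    PKT_FOP,
    PKT_CAS,            /* last RMA issuing type */
    PKT_LOCK,
    PKT_END_ALL
} pkt_type;

#define PKT_RMA_FIRST PKT_PUT
#define PKT_RMA_LAST  PKT_CAS

/* Single range check instead of a long switch over individual types. */
static bool is_rma_issuing_pkt(pkt_type t)
{
    return t >= PKT_RMA_FIRST && t <= PKT_RMA_LAST;
}

/* True if every packet type in the array is an RMA issuing type. */
static bool all_rma_issuing(const pkt_type *types, size_t n)
{
    size_t i;
    assert(types != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (!is_rma_issuing_pkt(types[i]))
            return false;
    }
    return true;
}
```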
    • Add IMMED area in packet header. · e8d4c6d5
      Xin Zhao authored

      We add an IMMED data area (16 bytes by default) to the
      packet header, which contains as much origin
      data as possible. If the origin can put all data in the
      packet header, it no longer needs to send a
      separate data packet. When the target receives the
      packet header, it first copies the data out of
      the IMMED data area. If there is still more
      data coming, it continues to receive the following
      packets; if all data is included in the header, then
      receiving is done.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
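The origin-side packing and target-side unpacking described above can be sketched as follows. The 16-byte size comes from the commit message; the pkt_hdr struct, its immed_len field and both helper functions are hypothetical names used for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define IMMED_AREA_SIZE 16   /* default size named in the commit message */

/* Hypothetical packet header with an IMMED data area. */
typedef struct {
    unsigned char immed_data[IMMED_AREA_SIZE];
    size_t immed_len;   /* bytes of immed_data actually used */
} pkt_hdr;

/* Origin: copy as much data as fits into the header.
 * Returns the number of bytes left for separate data packets. */
static size_t pack_immed(pkt_hdr *hdr, const void *src, size_t len)
{
    assert(hdr != NULL && (src != NULL || len == 0));
    hdr->immed_len = len < IMMED_AREA_SIZE ? len : IMMED_AREA_SIZE;
    if (hdr->immed_len > 0)
        memcpy(hdr->immed_data, src, hdr->immed_len);
    return len - hdr->immed_len;
}

/* Target: copy the IMMED data out first.
 * Returns the bytes still to be received; 0 means receiving is done. */
static size_t unpack_immed(const pkt_hdr *hdr, void *dst, size_t total_len)
{
    assert(hdr != NULL && dst != NULL && hdr->immed_len <= total_len);
    if (hdr->immed_len > 0)
        memcpy(dst, hdr->immed_data, hdr->immed_len);
    return total_len - hdr->immed_len;
}
```

A 6-byte payload fits entirely in the header, so both functions return 0 and no data packet follows; a 40-byte payload leaves 24 bytes for the following packets.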
    • Add useful pkt wrappers. · 1c638a12
      Xin Zhao authored

      Signed-off-by: Pavan Balaji <balaji@anl.gov>
    • Decrement Active Target counter at target side. · b73778ea
      Xin Zhao authored

      During PSCW, when there are active-message operations
      to be issued in Win_complete, we piggyback an AT_COMPLETE
      flag with such an operation so that when the target receives it,
      it can decrement a counter on the target side and detect
      completion when that counter reaches zero.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
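The target-side counting above amounts to a decrement-and-test on each flagged packet. In this sketch the flag bit value, the target_win struct and the handler name are hypothetical; the counter is assumed to start at the number of origins expected to complete.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FLAG_AT_COMPLETE (1u << 0)   /* hypothetical bit value */

/* Hypothetical target-side window state. */
typedef struct {
    int at_completion_counter;   /* origins that have not completed yet */
} target_win;

/* Called for each incoming RMA packet.
 * Returns true once every expected origin has signalled completion. */
static bool on_rma_pkt_flags(target_win *w, uint32_t flags)
{
    assert(w != NULL);
    if (flags & FLAG_AT_COMPLETE) {
        assert(w->at_completion_counter > 0);
        w->at_completion_counter--;
    }
    return w->at_completion_counter == 0;
}
```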
    • Detect remote completion by FLUSH / FLUSH_ACK messages. · 6578785d
      Xin Zhao authored

      When the origin wants to do a FLUSH sync, if there are
      active-message operations that are going to be issued,
      we piggyback the FLUSH message with the last operation;
      if there are no such operations, we just send a single FLUSH packet.

      If the last operation is a write op (PUT, ACC) or only
      a single FLUSH packet is sent, the target sends back
      a single FLUSH_ACK packet after receiving it;
      if the last operation contains a read action (GET, GACC, FOP,
      CAS), the target piggybacks a FLUSH_ACK flag with the
      response packet after receiving it.

      After the origin receives the FLUSH_ACK packet or a response packet
      with the FLUSH_ACK flag, it decrements the counter that
      tracks the number of outgoing sync messages (FLUSH / UNLOCK).
      When that counter reaches zero, the origin knows that remote
      completion has been achieved.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
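The target's choice between a separate FLUSH_ACK packet and a piggybacked flag, and the origin's counter, can be sketched as below. All type and function names are hypothetical; OP_NONE stands for a standalone FLUSH packet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Last operation that carried the FLUSH; OP_NONE means a standalone FLUSH packet. */
typedef enum { OP_NONE, OP_PUT, OP_ACC, OP_GET, OP_GACC, OP_FOP, OP_CAS } rma_op;

typedef enum { ACK_SEPARATE_PKT, ACK_PIGGYBACKED } flush_ack_mode;

/* Target: ops with a read action already produce a response packet,
 * so the FLUSH_ACK flag rides on it; everything else gets its own packet. */
static flush_ack_mode target_flush_ack_mode(rma_op last_op)
{
    switch (last_op) {
    case OP_GET:
    case OP_GACC:
    case OP_FOP:
    case OP_CAS:
        return ACK_PIGGYBACKED;
    default:
        return ACK_SEPARATE_PKT;
    }
}

/* Hypothetical origin-side window state. */
typedef struct {
    int outstanding_acks;   /* outgoing sync messages (FLUSH / UNLOCK) not yet acked */
} origin_win;

/* Origin: called for each FLUSH_ACK packet or flagged response.
 * Returns true when remote completion has been reached. */
static bool origin_on_flush_ack(origin_win *w)
{
    assert(w != NULL && w->outstanding_acks > 0);
    w->outstanding_acks--;
    return w->outstanding_acks == 0;
}
```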
    • Split shared RMA packet structures. · c0094faa
      Xin Zhao authored

      Previously several RMA packet types shared the same structure,
      which was misleading when writing code. Here we make different
      RMA packet types use different packet data structures.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
    • Embedding packet structure into RMA operation structure. · b1685139
      Xin Zhao authored

      We were duplicating information in the operation structure and in the
      packet structure when the message is actually issued.  Since most of
      the information is the same anyway, this patch just embeds a packet
      structure into the operation structure, so that we eliminate
      unnecessary copies.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
    • Rename ACK packets in RMA. · ba1a400c
      Xin Zhao authored

      The packet type MPIDI_CH3_PKT_PT_RMA_DONE is used for ACK
      of FLUSH / UNLOCK packets. Here we rename it to
      MPIDI_CH3_PKT_FLUSH_ACK and modify the related functions
      and data structures.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>