1. 30 May, 2015 8 commits
  2. 27 May, 2015 2 commits
  3. 26 May, 2015 1 commit
  4. 20 May, 2015 3 commits
  5. 13 May, 2015 2 commits
  6. 10 May, 2015 2 commits
  7. 29 Apr, 2015 1 commit
  8. 27 Apr, 2015 5 commits
    • OFI: Bug fix for RTS/CTS/DATA protocol. · 2069c15e
      Valentin Petrov authored
      
      
      MPID_nem_ofi_data_callback used to check sreq->cc in order to track progress of
      the RTS/CTS/DATA protocol. There was an implicit assumption that fi_tsend with RTS
      completes first. However this would cause a hang if fi_trecv completed earlier.
      The fix is: don't rely on the cc but rather check the tag bits explicitly.
      Note, the RTS/CTS/DATA bits are no longer accumulated (i.e., no more
      "wc->tag | CTS/DATA").
      Signed-off-by: Charles J Archer <charles.j.archer@intel.com>
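The idea of checking the tag bits explicitly, rather than inferring the protocol stage from a completion counter, can be sketched as follows. This is a minimal illustration only: the bit positions and the names `RCD_RTS`, `RCD_CTS`, `RCD_DATA`, `RCD_MASK`, `rcd_make_tag`, and `rcd_stage` are hypothetical and are not taken from the OFI netmod.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical protocol bits; the real netmod tag layout may differ. */
#define RCD_RTS   (UINT64_C(1) << 60)
#define RCD_CTS   (UINT64_C(1) << 61)
#define RCD_DATA  (UINT64_C(1) << 62)
#define RCD_MASK  (RCD_RTS | RCD_CTS | RCD_DATA)

typedef enum { STAGE_RTS, STAGE_CTS, STAGE_DATA, STAGE_UNKNOWN } rcd_stage_t;

/* Build a tag carrying exactly one protocol bit (bits are not accumulated). */
static uint64_t rcd_make_tag(uint64_t user_bits, uint64_t stage_bit)
{
    assert((user_bits & RCD_MASK) == 0);
    return user_bits | stage_bit;
}

/* Classify a completion from its tag alone, independent of the order in
 * which the send and receive completions arrive. */
static rcd_stage_t rcd_stage(uint64_t tag)
{
    switch (tag & RCD_MASK) {
    case RCD_RTS:  return STAGE_RTS;
    case RCD_CTS:  return STAGE_CTS;
    case RCD_DATA: return STAGE_DATA;
    default:       return STAGE_UNKNOWN;  /* zero or several bits set */
    }
}
```

Because each tag carries a single stage bit, a callback built this way gives the same answer whether the RTS send or the CTS receive completes first.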
    • OFI: MPIR_Barrier_impl should not be called from MPID_nem_ofi_finalize. · 34e57aa8
      Valentin Petrov authored
      
      
      MPIR_Barrier_impl uses nemesis shared memory, which is already cleaned up at this stage.
      However, without any synchronization a hang in the close protocol is possible,
      since RTS/CTS/DATA messages may still be in flight. This change fixes the issue.
      Signed-off-by: Charles J Archer <charles.j.archer@intel.com>
    • OFI: Add support for large tags using immediate data and OFI tag layouts · ec920e5f
      Valentin Petrov authored
      
      
      This patch modifies the OFI netmod to support large tag layouts while preserving the old
      tag layout.  OFI defines a 64-bit tag, but also provides for a 64-bit tag plus immediate data.
      Some OFI providers may call for different tag layouts.  This patch does not yet
      query for the proper tag layout or attempt to choose the optimal layout;
      it only provides macro/templatized support for different tag formats.  Additional selection
      criteria will be added in subsequent patches.
      
        * Tag layout is moved to a separate file.
          Added init_sendtag_M2, init_recvtag_M2 (M2 stands for MODE #2, i.e. the mode
          that uses fi_tsenddata and does not pack source into tag).
      
        * Created a template file for ofi_tagged.c
          Moved do_isend into a template file, which is included twice in ofi_tagged.c, thus providing
          two versions, do_isend and do_isend_2, corresponding to the two API sets.
      
        * All send functions are available in two versions.
          Added a macro that declares a function for the two API sets. The first set keeps the names inherited from
          the previous netmod version. The functions of the second API set have the "_2" suffix.
      
        * Recv_posted, anysource_posted, recv_callback, and ofi_probe are templatized.
      
        * ofi_tag_to_vc renamed to ofi_wc_to_vc
          Note, for API_SET_2 the pgid is stored in the imm data while
          psource and port are packed the same way as in API_SET_1.
      
        * Added api_set member to the gl_data struct.  Routines are initialized based on api_set
      
        * Added RCD (RtsCtsData) protocol identifiers
      
        * Added support for OFI MEM_TAG_FORMAT
      
        * PGID placement modified
      Signed-off-by: Charles J Archer <charles.j.archer@intel.com>
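The "one body, two API sets" approach described in the list above can be sketched with a single macro that emits both the unsuffixed and the "_2" variant of a function. This is only an illustration under stated assumptions: `wire_tag_t`, `sketch_sendtag`, `DECL_FOR_BOTH_API_SETS`, and the 32-bit field split are hypothetical and do not reproduce the netmod's actual tag layout or its template file.

```c
#include <assert.h>
#include <stdint.h>

#define API_SET_1 1   /* source packed into the 64-bit tag */
#define API_SET_2 2   /* source carried in immediate data */

typedef struct {
    uint64_t tag;   /* match bits handed to the tagged send */
    uint64_t imm;   /* immediate data; left at 0 for API set 1 */
} wire_tag_t;

/* Shared body, selected by api_set (hypothetical field layout). */
static wire_tag_t sketch_sendtag_impl(int api_set, uint32_t source,
                                      uint32_t user_tag)
{
    wire_tag_t w = { 0, 0 };
    assert(api_set == API_SET_1 || api_set == API_SET_2);
    if (api_set == API_SET_1) {
        w.tag = ((uint64_t) source << 32) | user_tag;
    } else {
        w.tag = user_tag;   /* whole tag is available to the user tag */
        w.imm = source;
    }
    return w;
}

/* Declare name() for API set 1 and name_2() for API set 2. */
#define DECL_FOR_BOTH_API_SETS(name)                                \
    static wire_tag_t name(uint32_t source, uint32_t user_tag)      \
    { return name##_impl(API_SET_1, source, user_tag); }            \
    static wire_tag_t name##_2(uint32_t source, uint32_t user_tag)  \
    { return name##_impl(API_SET_2, source, user_tag); }

DECL_FOR_BOTH_API_SETS(sketch_sendtag)
```

Callers then pick `sketch_sendtag` or `sketch_sendtag_2` once, at initialization, which mirrors selecting routines based on an `api_set` value.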
  9. 24 Apr, 2015 4 commits
  10. 22 Apr, 2015 4 commits
    • Fix arbitrary poll count before yielding. · abb56764
      Pavan Balaji authored
      
      
      Instead of polling for an arbitrarily decided number of times in the
      progress engine before yielding, we have now moved the yielding
      intelligence to the threading layer.  The threading layer can keep
      track of other threads that are waiting to enter the critical section
      and only yield if another thread is waiting.  In this way, if no
      thread is waiting to get the lock, the main thread never yields.  At
      the same time, if another thread is waiting to get a lock, there is no
      delay in yielding.
      
      This change, however, introduces possible deadlocks. If a thread enters
      MPIDI_CH3I_progress with is_blocking unset, it may set the
      MPIDI_CH3I_progress_blocked flag and then will yield the critical section.
      Another thread may enter with is_blocking set, find the flag
      MPIDI_CH3I_progress_blocked set, and block on the condition variable.
      The first thread will wake up and leave the progress engine without
      emitting any signal to wake up the second thread which may sleep forever.
      
      A simple fix is to yield the critical section only if the current thread
      entered the progress engine with is_blocking set.
      Signed-off-by: Halim Amer <aamer@anl.gov>
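The fix in the last paragraph of this message amounts to a guard on the yield decision. A minimal sketch, assuming a hypothetical helper `should_yield` and a waiter count supplied by the threading layer (neither name comes from MPICH):

```c
#include <assert.h>
#include <stdbool.h>

/* Decide whether a thread inside the progress engine may give up the
 * critical section. Yielding is allowed only for a thread that entered
 * with is_blocking set, and only when some other thread is waiting. */
static bool should_yield(bool is_blocking, int num_waiters)
{
    assert(num_waiters >= 0);
    return is_blocking && num_waiters > 0;
}
```

With this guard a nonblocking caller never releases the critical section midway, so it cannot leave the progress engine while another thread sleeps waiting for a signal that is never sent.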
    • Initial version of the intelligent thread yielding. · b39314a5
      Pavan Balaji authored
      
      
      Instead of a simple thread yield, this patch adds some additional
      information to the yield about how many threads are waiting for the lock.
      When a thread tries to acquire a lock, it increments a counter.  When
      a thread needs to yield, it can check this counter to see how many
      threads are waiting to get the lock.  If there are no threads waiting,
      the yield can be skipped.
      
      This patch contains various changes to make that happen:
      
      1. We modify the mutex object to maintain additional information on
      the number of queued threads.
      
      2. We improve the yield call to include the unlock and lock as well,
      since it needs to decide whether to do the unlock/lock based on how
      many other threads are queued up.
      Signed-off-by: Halim Amer <aamer@anl.gov>
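The two changes listed above (a mutex that tracks queued threads, and a yield that includes the unlock and lock) can be sketched with POSIX threads and C11 atomics. This is a hedged illustration, not the MPICH implementation: `counted_mutex_t`, `cm_init`, `cm_lock`, `cm_unlock`, and `cm_yield` are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

typedef struct {
    pthread_mutex_t mutex;
    atomic_int num_queued;   /* threads currently waiting to acquire */
} counted_mutex_t;

static void cm_init(counted_mutex_t *m)
{
    int rc = pthread_mutex_init(&m->mutex, NULL);
    assert(rc == 0);
    (void) rc;
    atomic_init(&m->num_queued, 0);
}

/* Announce intent to acquire, block on the mutex, then leave the queue. */
static void cm_lock(counted_mutex_t *m)
{
    atomic_fetch_add(&m->num_queued, 1);
    pthread_mutex_lock(&m->mutex);
    atomic_fetch_sub(&m->num_queued, 1);
}

static void cm_unlock(counted_mutex_t *m)
{
    pthread_mutex_unlock(&m->mutex);
}

/* Called by the lock holder. Skips the unlock/lock cycle entirely when
 * nobody is queued. Returns 1 if it yielded, 0 if the yield was skipped. */
static int cm_yield(counted_mutex_t *m)
{
    if (atomic_load(&m->num_queued) == 0)
        return 0;
    cm_unlock(m);
    sched_yield();
    cm_lock(m);
    return 1;
}
```

A holder that calls `cm_yield` in a polling loop pays only an atomic load when it is uncontended, and gives way immediately once another thread has entered `cm_lock`.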
    • Cleanup threaded progress. · f385680e
      Pavan Balaji authored
      
      
      The nemesis progress engine was written so that if one thread
      is inside the progress engine, other threads cannot enter the receive
      progress.  They can enter the send progress in some cases.  There
      doesn't seem to be a good reason for this behavior.  This patch
      unifies the two paths so that threads simply return for nonblocking
      operations and wait for a signal before entering the progress engine
      for blocking operations.
      Signed-off-by: Halim Amer <aamer@anl.gov>
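The entry rule described in this message (return immediately for nonblocking calls, wait for a signal for blocking calls) can be sketched with a mutex and a condition variable. The names `progress_state_t`, `progress_init`, `progress_enter`, and `progress_exit` are hypothetical and are not the nemesis symbols.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool busy;               /* true while some thread is inside */
} progress_state_t;

static void progress_init(progress_state_t *s)
{
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->cond, NULL);
    s->busy = false;
}

/* Returns true if the caller now owns the progress engine. A nonblocking
 * caller returns false at once when another thread is already inside;
 * a blocking caller waits until it is signalled that the engine is free. */
static bool progress_enter(progress_state_t *s, bool is_blocking)
{
    pthread_mutex_lock(&s->lock);
    if (s->busy && !is_blocking) {
        pthread_mutex_unlock(&s->lock);
        return false;
    }
    while (s->busy)
        pthread_cond_wait(&s->cond, &s->lock);
    s->busy = true;
    pthread_mutex_unlock(&s->lock);
    return true;
}

/* Leave the progress engine and wake any blocked waiters. */
static void progress_exit(progress_state_t *s)
{
    pthread_mutex_lock(&s->lock);
    assert(s->busy);
    s->busy = false;
    pthread_cond_broadcast(&s->cond);
    pthread_mutex_unlock(&s->lock);
}
```

Every exit path goes through `progress_exit`, so a blocked thread is always woken; omitting that signal on some path is the kind of bug the later deadlock fix addresses.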
    • mxm: fix anysource_matched · 1be5fc49
      Kenneth Raffenetti authored
      
      
      The return value of anysource_matched should be the actual result
      of the cancel operation. If the result is uncancelable, i.e. already
      matched, then CH3 will let the netmod message win and move on to the
      other requests in the queue. When the completion for the unsuccessfully
      canceled message comes in, we process it as usual.
      Reviewed-by: Igor Ivanov <Igor.Ivanov@itseez.com>
      Signed-off-by: Antonio J. Pena <apenya@mcs.anl.gov>
  11. 20 Apr, 2015 8 commits