1. 13 Jan, 2015 1 commit
    • Remove ADI breakage introduced earlier · 6f646ca0
      Wesley Bland authored
      
      
      An earlier change accidentally broke the ADI: MPI-level code queried
      the dev part of the MPID request object. This commit removes that
      breakage by adding a new macro to mpiimpl.h that portably checks
      whether a request is anysource. For now, in pamid, this macro always
      evaluates to 0. pamid can easily override it, but since pamid doesn't
      support FT, doing so would make no functional difference.
      Signed-off-by: Huiwei Lu <huiweilu@mcs.anl.gov>
  2. 12 Jan, 2015 4 commits
    • Change MPIDI_CH3I_Comm_AS_enabled to be MPID level · 8cbbcae4
      Wesley Bland authored
      
      
      This macro was used inside CH3 to determine if the communicator could be
      used for anysource communication. With the rewrite of the anysource
      fault tolerance logic, it is now necessary to use it at the MPI level.
      Because it is a macro and not a function, the macro is defined in
      mpiimpl.h as (1) and then overridden in the ch3 device. Future devices
      can also override it if desired.
      Signed-off-by: Huiwei Lu <huiweilu@mcs.anl.gov>
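The default-then-override arrangement described in this commit can be sketched with an `#ifndef` guard. This is only an illustration of the general pattern, not the MPICH source: the `DEMO_` names, the `demo_comm` type, and the `anysource_enabled` field are all hypothetical, and the real headers may arrange the override differently.

```c
#include <stdbool.h>

/* Hypothetical communicator type, for illustration only. */
typedef struct demo_comm {
    bool anysource_enabled;
} demo_comm;

/* --- Would live in a device header (hypothetical) ---
 * A device that tracks anysource support supplies its own definition. */
#define DEMO_COMM_AS_ENABLED(comm) ((comm)->anysource_enabled)

/* --- Would live in the generic header (hypothetical) ---
 * Fallback used only when no device has defined the macro. */
#ifndef DEMO_COMM_AS_ENABLED
#define DEMO_COMM_AS_ENABLED(comm) (1)
#endif

/* Upper-layer code uses only the macro and never reaches into
 * device-private state directly. */
int demo_can_post_anysource(const demo_comm *comm)
{
    return DEMO_COMM_AS_ENABLED(comm) ? 1 : 0;
}
```

Removing the device-level `#define` from this sketch makes `demo_can_post_anysource` return 1 for every communicator, which mirrors the "(1) by default" behavior the message describes.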
    • Break out of progress for anysource failures · 50d85e51
      Wesley Bland authored
      
      
      If a failure is detected, even if no request is actually complete, the
      completion counter will be incremented now as a way to give control back
      to the MPI layer to let it decide whether or not to continue.
      
      This gives the request completion functions a chance to see if they're
      waiting on an MPI_ANY_SOURCE request and if so, to return an error
      indicating that the completion function has a
      MPIX_ERR_PROC_FAILED_PENDING failure that the user needs to acknowledge.
      
      All of these functions should go into the progress engine at least once
      as a way to ensure that even if they will be returning an error, they'll
      at least give MPI a way to make progress and potentially still complete
      the request objects even if the user never acknowledges the failure.
      
      A follow-on commit will add the functionality to keep the progress
      engine from getting stuck if a failure is discovered before entering the
      completion function.
      Signed-off-by: Huiwei Lu <huiweilu@mcs.anl.gov>
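The control flow described in this commit can be approximated with a small single-threaded simulation. Everything here is an assumption made for illustration: the `demo_` types, the error constants, and the idea of scheduling completion and failure on a given poll number are invented, and the real progress engine is far more involved.

```c
#include <stdbool.h>

enum { DEMO_SUCCESS = 0, DEMO_ERR_PROC_FAILED_PENDING = 1 };

/* Hypothetical request: completes on a scheduled poll (1-based). */
typedef struct {
    bool complete;
    bool anysource;
    unsigned complete_on_poll;
} demo_request;

/* Hypothetical progress state; fail_on_poll == 0 means "never fails". */
typedef struct {
    unsigned polls;
    unsigned completion_count;
    unsigned fail_on_poll;
    bool failure_detected;
} demo_progress;

/* One pass of the simulated progress engine. */
static void demo_progress_poll(demo_progress *p, demo_request *r)
{
    p->polls++;
    if (!r->complete && p->polls >= r->complete_on_poll) {
        r->complete = true;
        p->completion_count++;
    }
    if (!p->failure_detected && p->fail_on_poll != 0 &&
        p->polls >= p->fail_on_poll) {
        p->failure_detected = true;
        /* Nothing completed, but bump the counter anyway so the
         * waiter regains control and can inspect the failure. */
        p->completion_count++;
    }
}

/* Simulated completion function. */
int demo_wait(demo_progress *p, demo_request *r)
{
    demo_progress_poll(p, r);            /* enter progress at least once */
    for (;;) {
        if (r->complete)
            return DEMO_SUCCESS;
        if (p->failure_detected && r->anysource)
            return DEMO_ERR_PROC_FAILED_PENDING;
        unsigned seen = p->completion_count;
        while (p->completion_count == seen)   /* "blocking" progress */
            demo_progress_poll(p, r);
    }
}
```

In this sketch a wildcard request returns the pending error as soon as the failure bumps the counter, while a request with a named source keeps waiting and still completes normally.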
    • Strip out pending ANY_SOURCE request handling · 7a785c84
      Wesley Bland authored
      
      
      The existing way that we handle non-blocking requests involving wildcard
      receive operations is incorrect. We're cancelling request operations and
      trying to recreate them later. In the meantime, it's messing with
      matching and makes it possible (likely?) that some messages that arrive
      will never be matched. A new way of handling this is coming next.
      Signed-off-by: Huiwei Lu <huiweilu@mcs.anl.gov>
    • Don't free a request if it still pending · a96ac72e
      Wesley Bland authored
      
      
      If we had a failure that caused a request to be pending, we were freeing
      the request before calling the error handler. That caused segfaults. Now
      we switch the ordering of the two to avoid that.
      
      This also moves the assignment of the status_ptr to be a little earlier
      to avoid another segfault.
      Signed-off-by: Huiwei Lu <huiweilu@mcs.anl.gov>
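The ordering fix in this commit amounts to "let the error handler read the request, then release it". The sketch below shows only that ordering. The `demo_` names are hypothetical and the handler is a stand-in that records the error code it saw, so this should not be read as the actual MPICH code path.

```c
#include <stdlib.h>

/* Hypothetical request carrying an error code (0 means no error). */
typedef struct demo_request {
    int error_code;
} demo_request;

/* Records what the stand-in error handler observed. */
static int demo_last_reported = 0;

/* Stand-in error handler: it dereferences the request, so the
 * request must still be alive when this runs. */
static int demo_call_errhandler(const demo_request *req)
{
    demo_last_reported = req->error_code;
    return req->error_code;
}

/* Finish a request: report any error first, free afterwards. */
int demo_finish(demo_request **req_p)
{
    demo_request *req = *req_p;
    int rc = 0;

    if (req->error_code != 0)
        rc = demo_call_errhandler(req);  /* 1. handler reads the request */

    free(req);                           /* 2. only now release it */
    *req_p = NULL;                       /* avoid a dangling pointer */
    return rc;
}
```

Swapping steps 1 and 2 would hand freed memory to the handler, which is the kind of segfault the message reports.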
  3. 06 Nov, 2014 1 commit
    • Check for pending any source ops · c2be640e
      Wesley Bland authored
      
      
      Before calling the progress engine, make sure none of the operations
      should return an error for MPIX_ERR_PROC_FAILED_PENDING. They would
      cause the progress engine to hang (potentially) so we can't enter it.
      Instead, mark the appropriate error codes and return immediately.
      Signed-off-by: Huiwei Lu <huiweilu@mcs.anl.gov>
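A pre-check of the kind this commit describes might look roughly like the following. The request layout, the `failure_known` flag, and the `DEMO_` error constants are assumptions made for the sketch. It only shows the idea of scanning the requests and returning before the progress engine is ever entered.

```c
#include <stdbool.h>
#include <stddef.h>

enum {
    DEMO_SUCCESS = 0,
    DEMO_ERR_PROC_FAILED_PENDING = 1,
    DEMO_ERR_IN_STATUS = 2
};

/* Hypothetical request state. */
typedef struct {
    bool complete;
    bool anysource;
} demo_request;

/* Scan the requests before entering progress. If a failure is already
 * known, every incomplete wildcard request gets the pending error and
 * the caller is told to return immediately (nonzero result). */
int demo_precheck(const demo_request *reqs, size_t n,
                  bool failure_known, int *errcodes)
{
    int rc = DEMO_SUCCESS;
    for (size_t i = 0; i < n; i++) {
        errcodes[i] = DEMO_SUCCESS;
        if (failure_known && reqs[i].anysource && !reqs[i].complete) {
            errcodes[i] = DEMO_ERR_PROC_FAILED_PENDING;
            rc = DEMO_ERR_IN_STATUS;
        }
    }
    return rc;
}
```

A caller in this sketch would skip the progress loop entirely whenever `demo_precheck` returns something other than `DEMO_SUCCESS`.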
  4. 07 Jul, 2014 1 commit
  5. 05 Nov, 2012 1 commit
  6. 10 Oct, 2012 1 commit
  7. 20 Sep, 2012 1 commit
    • [svn-r10247] Get rid of duplicate jump on failure for the MPIR_ERRTEST_ macros. · 06397126
      Pavan Balaji authored
      In several places, after checking for a parameter (e.g., comm) we were
      directly using it assuming that the parameter is valid.  Since the
      previous ERRTEST macros did not jump to fn_fail on an error, this
      could result in undefined behavior if the parameter was invalid.  Now,
      since we jump on errors within the macros themselves, once the check
      is done, we know that the parameter values are valid.
      
      Reviewed by buntinas.
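To illustrate why a jump inside the check macro makes the code after it safe, here is a minimal sketch. `DEMO_ERRTEST_COMM`, `demo_comm`, and the error constants are hypothetical stand-ins, not the real MPIR_ERRTEST_ macros. Only the `fn_exit`/`fn_fail` label convention is borrowed from the message.

```c
#include <stddef.h>

enum { DEMO_SUCCESS = 0, DEMO_ERR_COMM = 5 };

/* Hypothetical communicator type. */
typedef struct {
    int size;
} demo_comm;

/* Check macro that records the error AND jumps to fn_fail itself,
 * so no caller can forget the jump and fall through. */
#define DEMO_ERRTEST_COMM(comm, err)   \
    do {                               \
        if ((comm) == NULL) {          \
            (err) = DEMO_ERR_COMM;     \
            goto fn_fail;              \
        }                              \
    } while (0)

int demo_comm_size(const demo_comm *comm, int *size)
{
    int err = DEMO_SUCCESS;

    DEMO_ERRTEST_COMM(comm, err);
    /* Reaching this line means the check passed, so the
     * dereference below cannot see a NULL communicator. */
    *size = comm->size;

fn_exit:
    return err;
fn_fail:
    goto fn_exit;
}
```

If the macro only set `err` without jumping, the dereference of `comm` would still execute on a NULL argument, which is the undefined behavior the commit message warns about.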
  8. 04 Feb, 2011 1 commit
  9. 27 Jul, 2010 1 commit
    • [svn-r6919] completion counter cleanup (adds MPID_cc_t) · 0a5c22ae
      David Goodell authored
      When compiled for fine-grained threading, the completion counter serves
      as a form of lockfree signalling.  As such, atomic access and memory
      barriers must be used to ensure correctness.
      
      In per-object mode, this code also contains valgrind client request annotations
      to inform Helgrind/DRD/TSan about the lockfree signalling pattern.
      
      No reviewer.
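The lock-free signalling role of a completion counter can be sketched with C11 `<stdatomic.h>`. This swaps in standard C11 atomics purely for illustration. The commit itself introduces MPID_cc_t, whose actual implementation is not shown in the log, and the `demo_` names below are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical request: cc counts outstanding operations, and
 * payload is data that must be visible once cc reaches zero. */
typedef struct {
    atomic_int cc;
    int payload;
} demo_request;

void demo_request_init(demo_request *r)
{
    r->payload = 0;
    atomic_init(&r->cc, 1);   /* one operation outstanding */
}

/* Completing side: write the data, then decrement with release
 * ordering so the write is published before the counter change. */
void demo_complete(demo_request *r, int value)
{
    r->payload = value;
    atomic_fetch_sub_explicit(&r->cc, 1, memory_order_release);
}

/* Waiting side: an acquire load that observes zero also guarantees
 * the payload written before the matching release is visible. */
bool demo_is_complete(demo_request *r)
{
    return atomic_load_explicit(&r->cc, memory_order_acquire) == 0;
}
```

The release/acquire pairing is what makes it safe for another thread to read `payload` after `demo_is_complete` returns true. A plain non-atomic counter would give no such guarantee.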
  10. 20 May, 2010 1 commit
  11. 17 Sep, 2009 1 commit
  12. 16 May, 2009 1 commit
  13. 22 Sep, 2008 1 commit
  14. 02 Nov, 2007 1 commit