1. 26 Nov, 2014 2 commits
  2. 24 Nov, 2014 2 commits
    • romio gpfs: select correct read buffer · 230c2df3
      Paul Coffman authored and Rob Latham committed
      ROMIO GPFSMPIO_P2PCONTIG threaded read needs to toggle first read buffer.
      In these optimizations there was a correctness bug when reading: in the
      first round the read buffer did not toggle to the two-phase buffer for
      the pthread reader, so the data was disseminated from the wrong
      buffer.  The fix is to do the toggle after the first read.
      Signed-off-by: Paul Coffman <pkcoff@us.ibm.com>
      Signed-off-by: Rob Latham <robl@mcs.anl.gov>
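The toggle described in this commit can be pictured with a small double-buffer sketch. This is only an illustration under assumed names (`double_buffer`, `next_read_buffer`); it is not the ROMIO code, and it only shows the idea that the buffer index flips after every read, including the first one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical two-buffer state; not the actual ROMIO data structure. */
typedef struct {
    char *bufs[2];   /* the two buffers the reader alternates between */
    int   current;   /* index of the buffer the next read will fill */
} double_buffer;

/* Return the buffer for this round, then toggle so the next round uses
 * the other buffer. The toggle runs on every round, the first included. */
static char *next_read_buffer(double_buffer *db)
{
    assert(db != NULL);
    char *target = db->bufs[db->current];
    db->current ^= 1;
    return target;
}
```

Skipping the toggle on the first round would make the consumer of round one look at the buffer that has not been filled yet, which is the kind of mismatch the commit message describes.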
    • Make ROMIO htmldocs update link file · e645371f
      William Gropp authored and Rob Latham committed
      Update the use of DOCTEXT to match the rest of MPICH, including adding
      -nolocation (drop the location of the source file from the documentation)
      and ensure that the mpi.cit file contains the I/O routines as well as
      the others (this file can be used to add links to the man pages in
      other documents).
      Signed-off-by: Rob Latham <robl@mcs.anl.gov>
  3. 14 Nov, 2014 1 commit
  4. 13 Nov, 2014 2 commits
  5. 12 Nov, 2014 3 commits
    • Correctly handle errflag in MPI collectives · 47f62b0c
      Wesley Bland authored
      The MPI collectives get and set the errflag used by the collective
      helper functions (MPIC_*). The possible values of the errflag changed,
      so the collective functions need to appropriately set this value using
      This should allow collectives to correctly report process failures when
      they occur now, fixing the FT tests that use collectives (see #1945).
      Signed-off-by: Huiwei Lu <huiweilu@mcs.anl.gov>
    • Change errflag to be an enum · 3850e6bf
      Wesley Bland authored
      The errflag value being used in the MPIC helper functions only
      propagated whether or not an error occurred. It did not contain any
      information about what kind of error occurred, which made returning the
      correct error code after a process failure impossible.
      This patch converts the binary value to an enum with three options:
      The original use of TRUE and false maps to MPIR_ERR_NONE and
      MPIR_ERR_PROC_FAILED indicates that the error occurred
      because of a process failure. It uses the new bit set aside from the tag
      space to track such information between processes.
      This change required modifying lots of function signatures and type
      declarations to use the new enum type, but these are actually not very
      intrusive changes and shouldn't be a problem going forward.
      Signed-off-by: Huiwei Lu <huiweilu@mcs.anl.gov>
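A minimal sketch of the idea behind this change: a three-valued enum carries the kind of error, so the caller can pick the right return code. All names here (`errflag_t`, `ERRFLAG_*`, `RC_*`, `errflag_to_rc`) are illustrative assumptions and are not the MPICH identifiers.

```c
#include <assert.h>

/* Illustrative enum; the real MPICH names and values may differ. */
typedef enum {
    ERRFLAG_NONE = 0,      /* no error seen so far */
    ERRFLAG_PROC_FAILED,   /* a peer process failed */
    ERRFLAG_OTHER          /* any other error */
} errflag_t;

/* Hypothetical return codes standing in for MPI error classes. */
enum { RC_SUCCESS = 0, RC_PROC_FAILED = 1, RC_OTHER = 2 };

/* Map the propagated flag to the code a collective would return. */
static int errflag_to_rc(errflag_t flag)
{
    switch (flag) {
    case ERRFLAG_NONE:        return RC_SUCCESS;
    case ERRFLAG_PROC_FAILED: return RC_PROC_FAILED;
    default:                  return RC_OTHER;
    }
}
```

With a plain boolean, the last two cases would be indistinguishable, which is the limitation the commit message points out.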
    • Take a bit in the tag space for proc failure · 46f59276
      Wesley Bland authored
      We need to take another bit from the tag space to specify the difference
      between a generic failure and a process failure. This patch modifies the
      macros to handle this situation.
      Signed-off-by: Huiwei Lu <huiweilu@mcs.anl.gov>
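One way to picture reserving tag bits is the set of macros below. The bit positions and macro names are assumptions chosen for illustration; the actual MPICH macros and layout may differ.

```c
#include <assert.h>

/* Hypothetical bit positions inside a non-negative int tag. */
#define TAG_ERROR_BIT        (1 << 30)  /* generic failure */
#define TAG_PROC_FAILURE_BIT (1 << 29)  /* process failure */
#define TAG_RESERVED_BITS    (TAG_ERROR_BIT | TAG_PROC_FAILURE_BIT)

/* Set, test, and strip the reserved bits. */
#define TAG_SET_ERROR(tag)         ((tag) | TAG_ERROR_BIT)
#define TAG_SET_PROC_FAILURE(tag)  ((tag) | TAG_PROC_FAILURE_BIT)
#define TAG_HAS_ERROR(tag)         (((tag) & TAG_ERROR_BIT) != 0)
#define TAG_HAS_PROC_FAILURE(tag)  (((tag) & TAG_PROC_FAILURE_BIT) != 0)
#define TAG_STRIP(tag)             ((tag) & ~TAG_RESERVED_BITS)
```

Every bit taken this way shrinks the range of tags left for ordinary use, so a receiver would strip the reserved bits before matching on the original tag value.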
  6. 11 Nov, 2014 1 commit
  7. 06 Nov, 2014 2 commits
    • Return request from IRECV even if failure · 5b0cfb3b
      Wesley Bland authored
      We now return a request handle from MPI_IRECV even if there is a
      failure, because the ULFM spec says that even if
      the function returns MPIX_ERR_PROC_FAILED_PENDING, it should still
      provide a valid request that can be completed later.
      This doesn't cause a problem for other situations because the value of
      the request is undefined in that scenario so it's fine for it to be
      Signed-off-by: Huiwei Lu <huiweilu@mcs.anl.gov>
    • Check for pending any source ops · c2be640e
      Wesley Bland authored
      Before calling the progress engine, make sure none of the operations
      should return an error for MPIX_ERR_PROC_FAILED_PENDING. They would
      cause the progress engine to hang (potentially) so we can't enter it.
      Instead, mark the appropriate error codes and return immediately.
      Signed-off-by: Huiwei Lu <huiweilu@mcs.anl.gov>
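The pre-check described in this commit might look roughly like the sketch below. The `request` struct, the `ERR_*` codes, and `mark_pending_any_source` are hypothetical stand-ins, not MPICH types or functions; the sketch assumes that "pending" applies to incomplete wildcard-source receives once a peer failure is known.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request record; not MPICH's request object. */
typedef struct {
    bool any_source;  /* receive posted with a wildcard source */
    bool complete;    /* operation already finished */
    int  error;       /* error code recorded for this request */
} request;

enum { ERR_NONE = 0, ERR_PROC_FAILED_PENDING = 1 };

/* Mark incomplete wildcard receives as pending when a peer has failed.
 * Returns true if the caller should return at once instead of entering
 * the progress engine, where such a request could wait forever. */
static bool mark_pending_any_source(request *reqs, size_t n, bool peer_failed)
{
    bool must_return = false;
    if (!peer_failed)
        return false;
    for (size_t i = 0; i < n; i++) {
        if (reqs[i].any_source && !reqs[i].complete) {
            reqs[i].error = ERR_PROC_FAILED_PENDING;
            must_return = true;
        }
    }
    return must_return;
}
```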
  8. 05 Nov, 2014 1 commit
  9. 04 Nov, 2014 3 commits
    • Implement true request-based RMA operations. · 3e005f03
      Min Si authored
      There are two requests associated with each request-based
      operation: one normal internal request (req) and one newly
      added user request (ureq). We return ureq to user when
      request-based op call returns.
      The ureq is initialized with completion counter (CC) to 1
      and ref count to 2 (one is referenced by CH3 and another
      is referenced by user). If the corresponding op can be
      finished immediately in CH3, the runtime will complete ureq
      in CH3, and let user's MPI_Wait/Test to destroy ureq. If
      corresponding op cannot be finished immediately, we will
      first increment ref count to 3 (because now there are
      three places needed to reference ureq: user, CH3,
      progress engine). Progress engine will complete ureq when
      op is completed, then CH3 will release its reference during
      garbage collection, finally user's MPI_Wait/Test will
      destroy ureq.
      The ureq can be completed in following three ways:
      1. If op is issued and completed immediately in CH3
      (req is NULL), we just complete ureq before free op.
      2. If op is issued but not completed, we remember the ureq
      handler in req and specify OnDataAvail / OnFinal handlers
      in req to a newly added request handler, which will complete
      user request. The handler is triggered at three places:
         2-a. when progress engine completes a put/acc req;
         2-b. when get/getacc handler completes a get/getacc req;
         2-c. when progress engine completes a get/getacc req;
      3. If op is not issued (i.e., wait for lock granted), the 2nd
      way will be eventually performed when such op is issued by
      progress engine.
      Signed-off-by: Xin Zhao <xinzhao3@illinois.edu>
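The reference-count lifecycle described above can be traced with a toy model. The type and function names (`ureq_t`, `ureq_init`, and so on) are hypothetical, and the sketch assumes each holder releases exactly one reference, with the last release destroying the request.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-request record modeled on the commit description. */
typedef struct {
    int  cc;         /* completion counter: 1 until the op completes */
    int  ref_count;  /* number of holders still referencing the request */
    bool destroyed;  /* set when the last reference is dropped */
} ureq_t;

/* Initial state: CC = 1, two references (CH3 and the user). */
static void ureq_init(ureq_t *u)
{
    u->cc = 1;
    u->ref_count = 2;
    u->destroyed = false;
}

/* Deferred case: the progress engine becomes a third holder. */
static void ureq_add_ref(ureq_t *u) { u->ref_count++; }

/* Mark the operation as complete. */
static void ureq_complete(ureq_t *u) { u->cc = 0; }

/* Drop one reference; the last holder destroys the request. */
static void ureq_release(ureq_t *u)
{
    if (--u->ref_count == 0)
        u->destroyed = true;
}
```

In the deferred path the sequence would be init, add_ref, complete, then three releases (progress engine, CH3 garbage collection, the user's MPI_Wait/Test), with only the last one destroying the request.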
    • Rename enum MPICH_WITHIN_MPI to MPICH_IN_INIT · 9ea630d0
      Junchao Zhang authored
      The new enum name better describes an MPIR_MPI_State_t value
      indicating that MPICH is in initialization but has not completely finished.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
    • Make MPI_Initialized and friends thread-safe · 435ce800
      Junchao Zhang authored
      Implements MPI-Forum ticket 357 (https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/357).
      The ticket will be included in MPI-3.1, which adds thread-safety to MPI_INITIALIZED,
      In MPICH, we make MPIR_Process.mpich_state atomic. After MPI is fully initialized, i.e.,
      in POST_INIT state, MPI_QUERY_THREAD, MPI_IS_THREAD_MAIN are inherently thread-safe.
      Fixes #2137
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
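A minimal C11 sketch of an atomically updated library state, in the spirit of making `MPIR_Process.mpich_state` atomic. The state names and helper functions are assumptions for illustration; MPICH uses its own atomic primitives and state enum.

```c
#include <assert.h>
#include <stdatomic.h>

/* Illustrative lifecycle states; the real MPICH enum may differ. */
typedef enum {
    STATE_PRE_INIT,
    STATE_IN_INIT,
    STATE_POST_INIT,
    STATE_POST_FINALIZED
} lib_state;

/* Shared state, read and written only through atomic operations. */
static _Atomic int g_state = STATE_PRE_INIT;

static void set_state(lib_state s) { atomic_store(&g_state, (int)s); }

/* Analogue of MPI_Initialized: true once initialization has finished,
 * and still true after finalization. */
static int is_initialized(void)
{
    return atomic_load(&g_state) >= STATE_POST_INIT;
}

/* Analogue of MPI_Finalized. */
static int is_finalized(void)
{
    return atomic_load(&g_state) == STATE_POST_FINALIZED;
}
```

Because every query is a single atomic load, any thread can call these helpers at any time without taking a lock.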
  10. 03 Nov, 2014 2 commits
    • Add blocking ops / targets aggressively cleanup functions. · 41a365ec
      Xin Zhao authored
      When we run out of resources for operations and targets,
      we need to make the runtime complete some operations
      so that it can free some resources.
      For RMA operations, we implement this by doing an internal
      FLUSH_LOCAL for one target and waiting for operation
      resources; for RMA targets, we implement this by doing an
      internal FLUSH operation for one target and waiting for
      target resources.
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
    • Embedding packet structure into RMA operation structure. · b1685139
      Xin Zhao authored and Pavan Balaji committed
      We were duplicating information in the operation structure and in the
      packet structure when the message is actually issued.  Since most of
      the information is the same anyway, this patch just embeds a packet
      structure into the operation structure, so that we eliminate unnecessary
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
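The embedding can be sketched as follows. The field and type names (`rma_packet`, `rma_op`, `op_init_put`, `op_packet`) are hypothetical and much simpler than the real CH3 structures; the point is only that the operation owns the packet, so nothing is copied at issue time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wire packet for an RMA operation. */
typedef struct {
    int    type;         /* operation kind, e.g. put or get */
    int    target_rank;  /* destination process */
    size_t count;        /* number of elements */
} rma_packet;

enum { OP_PUT = 1 };

/* Operation record with the packet embedded, so target and count
 * live in one place instead of being duplicated. */
typedef struct rma_op {
    struct rma_op *next;  /* link in the pending-operation list */
    rma_packet     pkt;   /* filled once, sent as-is at issue time */
} rma_op;

/* Fill the embedded packet when the operation is created. */
static void op_init_put(rma_op *op, int target_rank, size_t count)
{
    op->next = NULL;
    op->pkt.type = OP_PUT;
    op->pkt.target_rank = target_rank;
    op->pkt.count = count;
}

/* At issue time the packet is already built; just hand it out. */
static const rma_packet *op_packet(const rma_op *op) { return &op->pkt; }
```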
  11. 28 Oct, 2014 2 commits
    • Assign large blocks first in ADIOI_GPFS_Calc_file_domains · c16466e3
      Paul Coffman authored and Rob Latham committed
      For files that are less than the size of a gpfs block there seems to be
      an issue if successive MPI_File_write_at_all are called with proceeding
      offsets.  Given the simple case of 2 aggs, the 2nd agg/fd will be utilized,
      however the initial offset into the 2nd agg is distorted on the 2nd call
      to MPI_File_write_at_all because of the negative size of the 1st agg/fd
      because the offset into the 2nd agg/fd is influenced by the size of the
      first.  Simple solution is to reverse the default large block assignment so
      in the case where only 1 agg/fd will be used it will be the first.  By chance
      in the 2 agg situation this is what the GPFSMPIO_BALANCECONTIG
      optimization does and it does not have this problem.
      Signed-off-by: Rob Latham <robl@mcs.anl.gov>
    • MP_IOTASKLIST error checking · 976272a7
      Paul Coffman authored and Rob Latham committed
      PE users may manually specify the MP_IOTASKLIST for explicit aggregator
      selection.  Code needed to be added to verify that the user-specified
      aggregators are all valid.
      Do our best to maintain the old PE behavior: use as much of the
      correctly specified MP_IOTASKLIST as possible, issue what PE labeled
      error messages (but were really warnings) about the incorrect
      portions, and functionally ignore those portions. If none of the list
      is usable, fall back on the default.
      Signed-off-by: Rob Latham <robl@mcs.anl.gov>
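The "keep what is valid, fall back if nothing is" policy could be sketched as below. This does not parse the MP_IOTASKLIST format; it assumes the list has already been turned into an array of ranks, and it assumes "valid" simply means a rank within `[0, nprocs)`. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Copy the in-range ranks from `requested` into `accepted`, preserving
 * their order. Returns how many were kept; a result of 0 tells the
 * caller to fall back on the default aggregator selection. A real
 * implementation would also warn about each rejected entry. */
static size_t filter_aggregators(const int *requested, size_t n,
                                 int nprocs, int *accepted)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        int rank = requested[i];
        if (rank < 0 || rank >= nprocs)
            continue;  /* ignore the incorrect portion */
        accepted[kept++] = rank;
    }
    return kept;
}
```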
  12. 24 Oct, 2014 1 commit
  13. 23 Oct, 2014 2 commits
    • Fix typo in 72513b14 · 4b20f2d2
      Wesley Bland authored
    • Remove _FT from state names · 72513b14
      Wesley Bland authored
      Back in the 3.1 series, we made the FT versions of all of the MPIC functions
      default. However, we never changed the names of all of the states. This
      removes the extra state names.
      No reviewer.
  14. 22 Oct, 2014 1 commit
    • Fix typo in bcast macro · e49213d6
      Wesley Bland authored
      The macro that called the bcast function left out an underscore in the
      mpi_errno return value. This caused the test to always return MPI_ERR_OTHER
      instead of the value being returned by the underlying bcast function.
      Signed-off-by: Huiwei Lu <huiweilu@mcs.anl.gov>
  15. 21 Oct, 2014 1 commit
  16. 20 Oct, 2014 4 commits
  17. 17 Oct, 2014 1 commit
  18. 14 Oct, 2014 1 commit
  19. 10 Oct, 2014 1 commit
    • mpi/coll: Fix incorrect parameter check · cc4b0952
      Igor Ivanov authored and Pavan Balaji committed
      Fixed the wrong parameter check condition for MPI_Iallgather and MPI_Iallgatherv:
      -1 is a valid value for sendcount when MPI_IN_PLACE is used.
      MPI spec says:
      The in place option for intracommunicators is specified by passing the value
      MPI_IN_PLACE to the argument sendbuf at all processes. In such a case, sendcount and
      sendtype are ignored, and the input data of each process is assumed to be in the area where
      that process would receive its own contribution to the receive buffer.
      Signed-off-by: Igor Ivanov <Igor.Ivanov@itseez.com>
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
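The corrected check can be illustrated with a small stand-alone function. `SKETCH_IN_PLACE` is a hypothetical sentinel standing in for MPI_IN_PLACE, and `check_send_args` is an assumed name; the real MPICH check uses its own error macros.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sentinel address playing the role of MPI_IN_PLACE. */
static char in_place_marker;
#define SKETCH_IN_PLACE ((const void *)&in_place_marker)

/* Return 0 if the send arguments are acceptable, -1 otherwise.
 * With the in-place option, sendcount is ignored, so any value
 * (including -1) must be accepted. */
static int check_send_args(const void *sendbuf, int sendcount)
{
    if (sendbuf == SKETCH_IN_PLACE)
        return 0;
    return (sendcount < 0) ? -1 : 0;
}
```

The ordering matters: testing `sendcount` before looking at `sendbuf` would reject a valid in-place call, which is the kind of error the commit describes.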
  20. 03 Oct, 2014 1 commit
    • mpi/comm: Fix MPI_Intercomm_merge · 051449e7
      Igor Ivanov authored and Pavan Balaji committed
      The previous code relied on a trick to get the true context id using an
      invalid communicator: it called MPIR_Get_contextid() on a
      new communicator that was not completely built (no comm_commit
      and comm_hook calls).
      netmod/mxm uses comm_hook and cannot work with this trick.
      This change avoids the call on an invalid communicator by using a
      temporary (intermediate) communicator.
      Signed-off-by: Igor Ivanov <Igor.Ivanov@itseez.com>
      Signed-off-by: Pavan Balaji <balaji@anl.gov>
  21. 01 Oct, 2014 2 commits
  22. 26 Sep, 2014 1 commit
  23. 18 Sep, 2014 1 commit
  24. 16 Sep, 2014 2 commits
    • Remove comm_split in deferred open case · 6b9e0dc0
      Rob Latham authored
      comm_split might scale poorly on some systems, and we don't even use the
      resulting communicator.  It was used as a marker, but we have enough
      other information.
    • fix logic in syshint processing check · 8da6ae0a
      Rob Latham authored
      Ken was right: I was being too clever before.  Now, simply and more
      explicitly check if anyone has a NULL systemwide-hint info object.