1. 30 Jan, 2014 1 commit
    • Move communicator destruction to after progress checks. · 29711bc6
      Pavan Balaji authored
      
      
      During finalize, we were destroying the COMM_WORLD, COMM_SELF and
      COMM_IWORLD communicator objects, and all other associated resources
      internally, before the final progress checks for incoming messages
      had finished.  This resulted in the following sequence of cleanup:
      
      1. COMM_WORLD got cleaned up.  Internally, there is a check to see if
      a group object has been allocated for COMM_WORLD.  If there is one, it
      is freed up.
      
      2. We waited for other messages to arrive.  We noticed a failure at
      this time, so we tried to create a failed process group.  This uses
      the COMM_WORLD group internally, causing it to be created again, but
      with a reference count of 2, since the code assumes that the first
      reference is always held by the original COMM_WORLD.

      3. When we tried to free the world group, we noticed that the
      reference count was 2, so we decremented the reference count and did
      not actually free the object.
      
      Moving the check for incoming messages to happen before the
      communicator free fixes this problem.  Note that the PG finalization
      still needs to be the last step since that cleans up all the VCs as
      well.
      
      See #1996
      
      Signed-off-by: Wesley Bland <wbland@mcs.anl.gov>
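      The cleanup sequence described in this commit can be illustrated with
      a minimal C sketch.  It is not MPICH code: group_t, comm_t,
      comm_get_group, group_release, comm_destroy, final_progress_check and
      the two finalize_* functions are hypothetical names.  The model
      assumes only what the message states, namely that the world group is
      created lazily and starts with a reference count of 2.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not MPICH's real types. */
typedef struct group { int ref_count; } group_t;
typedef struct comm { group_t *group; /* NULL until first needed */ } comm_t;

/* Lazily create the world group.  On creation the count is 2: one
 * reference assumed to belong to the communicator, one to the caller. */
group_t *comm_get_group(comm_t *comm)
{
    if (comm->group == NULL) {
        comm->group = malloc(sizeof(group_t));
        assert(comm->group != NULL);
        comm->group->ref_count = 2;
    } else {
        comm->group->ref_count++;
    }
    return comm->group;
}

/* Drop one reference; free the object only when the count reaches 0. */
void group_release(comm_t *comm)
{
    if (comm->group != NULL && --comm->group->ref_count == 0) {
        free(comm->group);
        comm->group = NULL;
    }
}

/* Communicator teardown releases the communicator's reference, if any. */
void comm_destroy(comm_t *comm)
{
    group_release(comm);
}

/* A late failure makes the progress check use the world group briefly. */
void final_progress_check(comm_t *comm)
{
    comm_get_group(comm);   /* would be used to build the failed group */
    group_release(comm);    /* drops only the caller's reference */
}

/* Old order: destroy first, then progress.  Returns 1 if the group leaks. */
int finalize_old_order(void)
{
    comm_t world = { NULL };
    int leaked;

    comm_destroy(&world);          /* nothing allocated yet */
    final_progress_check(&world);  /* group created late, count ends at 1 */
    leaked = (world.group != NULL);
    free(world.group);             /* keep the sketch itself leak-free */
    return leaked;
}

/* New order: progress first, then destroy.  Returns 1 if the group leaks. */
int finalize_new_order(void)
{
    comm_t world = { NULL };
    int leaked;

    final_progress_check(&world);  /* count goes 2, then 1 */
    comm_destroy(&world);          /* count reaches 0, object freed */
    leaked = (world.group != NULL);
    free(world.group);
    return leaked;
}
```

      In this model finalize_old_order() reports a leftover group object
      while finalize_new_order() does not, which mirrors the ordering
      argument made in the message.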
  2. 29 Jan, 2014 8 commits
    • Bug-fix: deal with wrap around PMI process mapping strings · f46354ac
      Pavan Balaji authored
      
      
      If the PMI process mapping string wraps around to node 0, we were
      building an incorrect list of which processes are local and which
      are not.  This patch provides a hacky fix for this case by only
      repeating the part of the PMI mapping string from the point where it
      wrapped around.

      The patch is hacky because it assumes that seeing a start node ID of 0
      means a wrap around.  This is not necessarily true.  A user-defined
      node list can use the node ID 0 without actually creating a wrap
      around.  This patch still works in that case because Hydra creates a
      new node list starting from node ID 0 for user-specified nodes during
      MPI_Comm_spawn{_multiple}.  If a different process manager searches
      for allocated nodes in the user-specified list, this patch will break.
      
      Fixes #2007.
      
      Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>
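      The wrap-around heuristic can be sketched in C as follows.  This is an
      illustration under stated assumptions, not the patched MPICH code: it
      assumes the mapping string has already been parsed into blocks of
      (start node ID, number of nodes, processes per node), and the names
      map_block_t and expand_mapping are hypothetical.  When the blocks run
      out before every rank is placed, the sketch repeats only from the last
      block whose start node ID is 0, which is the assumption the message
      calls hacky.

```c
#include <assert.h>
#include <stddef.h>

/* One parsed mapping block (hypothetical layout): `count` consecutive
 * nodes starting at `start_id`, each hosting `size` processes. */
typedef struct map_block {
    int start_id;
    int count;
    int size;
} map_block_t;

/* Fill node_of[rank] for ranks 0..nprocs-1.  Returns 0 on success and
 * -1 if the block list is empty or contains a non-positive count/size. */
int expand_mapping(const map_block_t *blocks, size_t nblocks,
                   int *node_of, int nprocs)
{
    size_t wrap = 0;   /* block to restart from once the list is used up */
    size_t b;
    int rank = 0;

    assert(blocks != NULL && node_of != NULL);
    if (nblocks == 0)
        return -1;

    for (b = 0; b < nblocks; b++) {
        if (blocks[b].count <= 0 || blocks[b].size <= 0)
            return -1;
        /* Heuristic from the commit: start node ID 0 marks a wrap. */
        if (blocks[b].start_id == 0)
            wrap = b;
    }

    b = 0;
    while (rank < nprocs) {
        for (int n = 0; n < blocks[b].count && rank < nprocs; n++)
            for (int p = 0; p < blocks[b].size && rank < nprocs; p++)
                node_of[rank++] = blocks[b].start_id + n;
        if (++b == nblocks)
            b = wrap;   /* repeat only the part after the wrap point */
    }
    return 0;
}
```

      For example, with the blocks (2,2,1) and (0,3,1) and eight processes,
      the sketch places ranks on nodes 2, 3, 0, 1, 2 and then repeats only
      0, 1, 2 for the remaining ranks.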
    • handle when qsort is not available in vc setup · 7a9210ff
      Kenneth Raffenetti authored and Pavan Balaji committed
      
      
      When qsort is not available, don't define the comparison function and
      fall back to a simple insertion sort implementation. In the future, a
      more general function with fallback should be added in MPL so it
      can be used in other cases like comm_split.
      
      Refs #2007
      
      Signed-off-by: Pavan Balaji <balaji@mcs.anl.gov>
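      A fallback of this kind might look like the following C sketch.  The
      macro name HAVE_QSORT and the function names compare_ints and
      sort_ints are assumptions made for illustration; the commit does not
      show the actual code.  The comparison function is compiled only when
      qsort is available, and an insertion sort is used otherwise.

```c
#include <assert.h>
#include <stddef.h>

#ifdef HAVE_QSORT   /* assumed configure-time macro */
#include <stdlib.h>

/* Comparison callback, defined only when qsort will actually use it. */
int compare_ints(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;
    return (x > y) - (x < y);
}
#endif

/* Sort v[0..n-1] in ascending order. */
void sort_ints(int *v, size_t n)
{
    assert(v != NULL || n == 0);
    if (n < 2)
        return;
#ifdef HAVE_QSORT
    qsort(v, n, sizeof(int), compare_ints);
#else
    /* Simple insertion sort: fine for short lists, O(n^2) in general. */
    for (size_t i = 1; i < n; i++) {
        int key = v[i];
        size_t j = i;
        while (j > 0 && v[j - 1] > key) {
            v[j] = v[j - 1];
            j--;
        }
        v[j] = key;
    }
#endif
}
```

      Keeping both paths behind one function is also what would make it
      easy to move such a helper into a shared library later, as the
      message suggests.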
    • Improve PMI_process_mapping parsing. · 687dd1dc
      Pavan Balaji authored
      
      
      The original PMI process mapping parsing code made a number of
      assumptions that allowed it to work only on COMM_WORLD.  This
      patch corrects these so that it works for dynamic processes as well.

      It also corrects the computation of the number of nodes used so that
      it is right in the general case.
      
      Refs #2007.
      
      Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>
    • Remove unused flag. · 0f0f20a6
      Pavan Balaji authored
      
      
      Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>
    • Revert "Move communicator destruction to after progress checks." · b270ae2b
      Pavan Balaji authored
      This reverts commit 058c8bf0.
      
      Refs #1996.
    • Add configure option to disable FT tests · 696f9fa8
      Wesley Bland authored and Pavan Balaji committed
      
      
      Some netmods can't handle FT right now. To allow the test suite to work
      properly on those netmods, this adds an option for those tests to be disabled
      at configure time using the flag --disable-ft-tests.
      
      Fixes #2005
      
      Signed-off-by: Pavan Balaji <balaji@mcs.anl.gov>
    • Move communicator destruction to after progress checks. · 058c8bf0
      Pavan Balaji authored
      
      
      During finalize, we were destroying the COMM_WORLD, COMM_SELF and
      COMM_IWORLD communicator objects, and all other associated resources
      internally, before the final progress checks for incoming messages
      had finished.  This resulted in the following sequence of cleanup:
      
      1. COMM_WORLD got cleaned up.  Internally, there is a check to see if
      a group object has been allocated for COMM_WORLD.  If there is one, it
      is freed up.
      
      2. We waited for other messages to arrive.  We noticed a failure at
      this time, so we tried to create a failed process group.  This uses
      the COMM_WORLD group internally, causing it to be created again, but
      with a reference count of 2, since the code assumes that the first
      reference is always held by the original COMM_WORLD.

      3. When we tried to free the world group, we noticed that the
      reference count was 2, so we decremented the reference count and did
      not actually free the object.
      
      Moving the check for incoming messages to happen before the
      communicator free fixes this problem.
      
      See #1996
      
      Signed-off-by: Wesley Bland <wbland@mcs.anl.gov>
    • mark cas_type_check test as xfail · a6652e0f
      Kenneth Raffenetti authored
      No reviewer.
  3. 27 Jan, 2014 4 commits
  4. 26 Jan, 2014 6 commits
  5. 24 Jan, 2014 1 commit
  6. 21 Jan, 2014 3 commits
  7. 20 Jan, 2014 1 commit
    • Revert "a partial round of datatype optimizations" · 7a9a46fb
      Rob Latham authored
      This reverts commit 38ef5818.
      
      The MPICH-1 and Intel tests found unexpected results with these
      optimizations.  Will explore later.
      
      Conflicts:
      	src/mpid/common/datatype/dataloop/dataloop_optimize.c
      	src/mpid/common/datatype/mpid_type_debug.c
  8. 19 Jan, 2014 4 commits
  9. 18 Jan, 2014 4 commits
    • Time iterations and break out if we are too slow. · 29c8529f
      Pavan Balaji authored
      
      
      On some machines the iterations take unusually long.  If the elapsed
      time exceeds a predefined limit, break out of the loop.
      
      Fixes #1669.
      
      Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>
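      The idea of timing iterations and bailing out can be sketched in C as
      below.  This is a generic illustration, not the test's actual code:
      run_timed, clock_fn, step_clock and count_call are hypothetical
      names, and the clock is passed in as a function pointer so that the
      cutoff can be demonstrated with a deterministic fake clock.  In an
      MPI test, a function such as MPI_Wtime, which also returns seconds as
      a double, could plausibly serve as the clock.

```c
#include <assert.h>
#include <stddef.h>

/* Any function that returns the current time in seconds. */
typedef double (*clock_fn)(void);

/* Demo helpers: a fake clock that advances one second per call, and a
 * unit of "work" that only counts how often it ran. */
static double fake_time = 0.0;
static int work_calls = 0;

double step_clock(void)
{
    double t = fake_time;
    fake_time += 1.0;
    return t;
}

void count_call(void)
{
    work_calls++;
}

/* Run work() up to max_iters times, but stop early once more than
 * max_seconds have elapsed.  Returns the number of iterations done. */
int run_timed(int max_iters, double max_seconds,
              void (*work)(void), clock_fn now)
{
    double start;
    int done = 0;

    assert(work != NULL && now != NULL);
    start = now();
    while (done < max_iters) {
        work();
        done++;
        if (now() - start > max_seconds)
            break;   /* too slow: give up on the remaining iterations */
    }
    return done;
}
```

      With the fake clock, a budget of 2.5 seconds stops a 100-iteration
      loop after three iterations, while a generous budget lets a short
      loop run to completion.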
    • White space cleanup. · 57dc1401
      Pavan Balaji authored
      The code was too hard to read to make any changes.
    • compile wrapper cleanup · eb42f624
      Kenneth Raffenetti authored and Pavan Balaji committed
      
      
      Simplify logic in compile wrapper scripts. Use configure substitutions
      where possible to better match pkg-config style.
      
      Signed-off-by: Pavan Balaji <balaji@mcs.anl.gov>
      
      Includes the following modifications by Pavan Balaji:
      
      Remove the PAC_COMPILER_SHLIB_FLAGS usage instead of modifying the
      macro in confdb.

      The ordering of flags in mpicc and friends does not match that of
      pkg-config.  This is for two reasons.

      1. pkg-config reorders flags when it outputs them.  Matching that would
      require us to manually adjust the flags in mpicc, which is error prone.
      
      2. mpicc and friends provide LDFLAGS before the user-specified flags,
      followed by the include and library directories.  This is to make sure
      that the LDFLAGS are listed before the application source file.
      Reordering them to match pkg-config loses this flexibility.
      
      Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>
    • Improve pkg-config support · be278b7c
      Kenneth Raffenetti authored and Pavan Balaji committed
      
      
      Add rpath flags to pkg-config to match compiler wrappers. Fixes #1044
      
      Signed-off-by: Pavan Balaji <balaji@mcs.anl.gov>
  10. 16 Jan, 2014 3 commits
    • a partial round of datatype optimizations · 38ef5818
      Rob Latham authored
      
      
      Some datatype performance tests in the MPICH test suite fail:
      (perf/twovec,  perf/nestvec, perf/nestvec2, perf/indexperf,
      perf/transp-datatype).
      
      This changeset introduces a few optimizations that operate on the
      dataloop representation to make it more performant.  perf/indexperf
      should still fail under these changes.
      
      Original-author: Bill Gropp <wgropp@illinois.edu>
      
      See #1788, for which this resolves some but not all performance issues.
      
      Signed-off-by: Rob Latham <robl@mcs.anl.gov>
    • Fix bogus datatype perf test · 4e1b470d
      William Gropp authored
      The test in test/mpi/perf/twovec made invalid assumptions about the
      performance of two MPI datatype creation routines.  This is a hard test to
      get right, but this version is more likely to avoid falsely signalling
      an error.
    • Comment out nb_test, since it's not entirely correct. · cf551af4
      Pavan Balaji authored
      
      
      This was meant to test the case where MPI_Test is not nonblocking.
      However, we ended up assuming that MPI_Win_lock will be nonblocking.
      That is not specified by the standard and might not be true.
      Commenting this out until we find a better way to test the original
      problem with MPI_Test.
      
      Fixes #1910.
      
      Signed-off-by: Rajeev Thakur <thakur@mcs.anl.gov>
  11. 15 Jan, 2014 5 commits