  1. 22 May, 2014 1 commit
    • Make handling of request cleanup more uniform · 1e171ff6
      Wesley Bland authored
      
      
      There are quite a few places where the request cleanup is done via:
      
      MPIU_Object_set_ref(req, 0);
      MPIDI_CH3_Request_destroy(req);
      
      when it should be:
      
      MPID_Request_release(req);
      
      This makes the handling more uniform so requests are cleaned up by releasing
      references rather than hitting them with the destroy hammer.
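
The difference between releasing a reference and forcing destruction can be sketched as follows. This is a minimal illustration only: `demo_request` and its helpers are hypothetical stand-ins and are not the MPICH request API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a request object (not MPID_Request). */
typedef struct demo_request {
    int ref_count;   /* number of outstanding references */
    int *destroyed;  /* set to 1 when the object is actually freed */
} demo_request;

/* Allocate a request that starts with one reference. */
static demo_request *demo_request_create(int *destroyed_flag)
{
    demo_request *req = malloc(sizeof(*req));
    assert(req != NULL);
    req->ref_count = 1;
    req->destroyed = destroyed_flag;
    *destroyed_flag = 0;
    return req;
}

/* Take an additional reference. */
static void demo_request_add_ref(demo_request *req)
{
    req->ref_count++;
}

/* Drop one reference; free the object only when the last one goes away. */
static void demo_request_release(demo_request *req)
{
    assert(req->ref_count > 0);
    if (--req->ref_count == 0) {
        *req->destroyed = 1;
        free(req);
    }
}
```

With this pattern, a holder that still has a reference keeps the object alive, whereas zeroing the count and destroying the object would free it underneath that holder.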
      
      Fixes #1664
      Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>
  2. 23 Apr, 2014 1 commit
  3. 11 Apr, 2014 2 commits
  4. 02 Apr, 2014 1 commit
  5. 01 Apr, 2014 2 commits
    • Move symbols to correct libraries. · 9c337914
      Pavan Balaji authored and Kenneth Raffenetti committed
      
      
      Maintain a list of files that go into each library.  If a particular
      binding is not enabled, the list variable still exists, but will just
      be empty.  This simplifies the management of which files/symbols go
      into which library.
      
      Move all MPI_ symbols to the libmpi library and all other symbols to
      the libpmpi library.  All Fortran 77 symbols go into libmpif77.so,
      while C symbols go into libmpi.so.  There are some exceptions, such as
      status_f2c, which are handled by the Fortran code but used in C.  Our
      Fortran 90 build only creates a few symbols and uses the f77 symbols
      for everything else.  These few symbols go into libmpifort.so.
      
      Also update compiler wrappers to link to correct libraries.  mpif77
      should now link with libmpif77.  mpif90 links with both libmpifort and
      libmpif77, since our F90 build still keeps the core Fortran library
      symbols in libmpif77.
      
      We completely ignored the F77 library earlier.  This was OK because
      all of the Fortran symbols were ending up in libmpi.  Now that we have
      separated out the symbols into the right libraries, we need to link to
      libmpif77 as well.
      
      Also added inter-library dependencies.
      
      libmpi has a dependency on several internal libraries: libmpl, libopa.
      libmpicxx did not have a dependency on libmpi, added.
      libmpif77 did not have a dependency on libmpi, added.
      libmpifort did not have a dependency on libmpi, added.
      
      This dependency model is sufficient for C and F77, but not for C++ and
      F90.  The C and F77 libraries contain all the symbols the application
      relies on, but the F90 and C++ libraries don't.  In the case of F90,
      symbols such as mpi_bcast are missing and are borrowed from the F77
      library.  In the case of C++, mpicxx.h contains calls directly to C
      functions (such as MPI_Reduce_local), which get embedded into the
      application.
      
      Fixes #2023.
      Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>
    • Rename mpich libraries. · 42fe2ccf
      Pavan Balaji authored and Kenneth Raffenetti committed
      
      
      The following library names are used to make the naming consistent
      across the ABI compatibility group:
      
      C libraries: libmpi.* and libpmpi.*
      C++ library: libmpicxx.*
      F77 libraries: libmpif77.*
      F90+ library: libmpifort.*
      
      This patch also gets rid of the FWRAPNAME variable, which is a
      duplicate of MPIFLIBNAME.  Similarly, FCWRAPNAME is removed and a new
      variable MPIFCLIBNAME is added, so it's consistent with the other
      names.
      
      PMPIFLIBNAME, which was unused, is no longer present.
      
      Fixes #2039.
      Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>
  6. 23 Mar, 2014 1 commit
    • Remove the use of MPIDI_TAG_UB · 055abbd3
      Wesley Bland authored and Pavan Balaji committed
      
      
      The constant MPIDI_TAG_UB is used in only one place at the moment, in the
      initialization of ch3 (source:src/mpid/ch3/src/mpid_init.c@4b35902a#L131). The
      problem is that the value which is being set (MPIR_Process.attrs.tag_ub) is
      set differently in pamid (INT_MAX). This leads to weird results when we set
      apart a bit in the tag space for failure propagation in non-blocking
      collectives (see #2008).
      
      Since this value isn't being referenced anywhere else, there doesn't seem to
      be a use for it and it's just leading to confusion. To avoid this, here we
      remove this value and just set MPIR_Process.attrs.tag_ub to INT_MAX in both
      ch3 and pamid.
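
To see why two devices disagreeing on the tag upper bound matters once a bit is set apart, here is a rough sketch. It assumes, purely for illustration, that the reserved bit is derived from the highest bit of the upper bound; the `demo_*` helpers are hypothetical and do not reproduce the MPICH code.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: mask of the highest set bit of a tag upper bound. */
static int demo_highest_bit(int tag_ub)
{
    assert(tag_ub > 0);
    int bit = 1;
    /* Compare against tag_ub / 2 so that bit never overflows. */
    while (bit <= tag_ub / 2)
        bit <<= 1;
    return bit;
}

/* Bit set apart for error signalling (assumed to be the highest one). */
static int demo_error_bit(int tag_ub)
{
    return demo_highest_bit(tag_ub);
}

/* Largest tag left to the user once the error bit is reserved. */
static int demo_user_tag_ub(int tag_ub)
{
    return demo_highest_bit(tag_ub) - 1;
}
```

Under this assumption, an upper bound of INT_MAX reserves bit 30, while a smaller upper bound reserves a different bit, so two devices with different bounds would not agree on where the flag lives. Using INT_MAX everywhere removes that ambiguity.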
      
      See #2009
      Signed-off-by: Pavan Balaji <balaji@mcs.anl.gov>
  7. 13 Mar, 2014 1 commit
    • Fixes inconsistent definition of parameters · 33337436
      Huiwei Lu authored
      
      
      In MPID_Win_allocate and MPID_Win_allocate_shared, baseptr is declared
      as void * and void **, respectively, while in MPIDI_Win_fns, both
      MPID_Win_allocate and MPID_Win_allocate_shared are registered as
      MPIDI_CH3U_Win_allocate, where baseptr is declared as void *.
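
The void * convention for an output pointer, and why every function registered in the same function-pointer slot needs the same declaration, can be sketched like this. The names `demo_win_allocate`, `demo_win_alloc_fn` and `demo_alloc_table` are hypothetical; only the general MPI_Win_allocate-style convention (baseptr declared as void * but holding the address of a pointer) is taken as given.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical allocator with an MPI_Win_allocate-style output parameter:
 * baseptr is declared void * but must hold the address of a pointer. */
static int demo_win_allocate(size_t size, void *baseptr)
{
    assert(baseptr != NULL);
    void *mem = malloc(size);
    if (mem == NULL)
        return 1;
    /* Store the new buffer through the caller's pointer. */
    *(void **) baseptr = mem;
    return 0;
}

/* One function-pointer type shared by every allocator in the table, so all
 * registered functions must declare baseptr the same way. */
typedef int (*demo_win_alloc_fn)(size_t size, void *baseptr);

static const demo_win_alloc_fn demo_alloc_table[] = { demo_win_allocate };
```

A caller passes `&ptr` for a `void *ptr` and reads the allocated address back from `ptr` afterwards.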
      
      Fixes #1995
      Signed-off-by: Junchao Zhang <jczhang@mcs.anl.gov>
  8. 09 Mar, 2014 1 commit
  9. 08 Mar, 2014 1 commit
  10. 26 Feb, 2014 1 commit
  11. 25 Feb, 2014 1 commit
  12. 10 Feb, 2014 1 commit
  13. 09 Feb, 2014 2 commits
  14. 01 Feb, 2014 1 commit
  15. 31 Jan, 2014 1 commit
  16. 30 Jan, 2014 1 commit
    • Move communicator destruction to after progress checks. · 29711bc6
      Pavan Balaji authored
      
      
      During finalize, we were destroying the COMM_WORLD, COMM_SELF and
      COMM_IWORLD communicator objects, and all other associated resources
      internally, before the final progress checks for incoming messages
      had finished.  This resulted in the following sequence of cleanup:

      1. COMM_WORLD got cleaned up.  Internally, there is a check to see if
      a group object has been allocated for COMM_WORLD.  If there is one, it
      is freed.

      2. We waited for other messages to arrive.  We noticed a failure at
      this time, so we tried to create a failed process group.  This uses the
      COMM_WORLD group internally, causing it to be created again, but with
      a reference count of 2, since the code assumes that the first
      reference always belongs to the original COMM_WORLD.

      3. When we tried to free the world group, we noticed that the reference
      count was 2, so we only decremented the reference count and never
      actually freed the object.

      Moving the check for incoming messages to happen before the
      communicator free fixes this problem.  Note that the PG finalization
      still needs to be the last step since that cleans up all the VCs as
      well.
      
      See #1996
      Signed-off-by: Wesley Bland <wbland@mcs.anl.gov>
  17. 29 Jan, 2014 5 commits
    • Bug-fix: deal with wrap around PMI process mapping strings · f46354ac
      Pavan Balaji authored
      
      
      If the PMI process mapping string wraps around to node 0, we were
      building an incorrect node list of which processes are local and which
      are not.  This patch provides a hacky fix for this case by only
      repeating the part of the PMI mapping string from the point where it
      wrapped around.
      
      The patch is hacky because it assumes that seeing a start node ID of 0
      means a wrap around.  This is not necessarily true.  A user-defined
      node list can use the node ID 0 without actually creating a wrap
      around.  This patch still works in that case because Hydra creates a
      new node list starting from node ID 0 for user-specified nodes during
      MPI_Comm_spawn{_multiple}.  If a different process manager searches
      for allocated nodes in the user-specified list, this patch will break.
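
The wrap-around heuristic described above can be sketched roughly as follows. The sketch assumes that a mapping has already been parsed into blocks of (start node ID, number of nodes, processes per node); the `demo_map_block` type and both helpers are hypothetical and are not the parsing code touched by this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical parsed form of one mapping block. */
typedef struct demo_map_block {
    int start_node;      /* ID of the first node in the block */
    int num_nodes;       /* number of consecutive nodes */
    int procs_per_node;  /* processes placed on each node */
} demo_map_block;

/* Assign a node ID to each rank in block order.
 * Returns the number of ranks filled in (at most max_ranks). */
static int demo_expand(const demo_map_block *blocks, size_t nblocks,
                       int *node_ids, int max_ranks)
{
    assert(blocks != NULL && node_ids != NULL);
    int rank = 0;
    for (size_t b = 0; b < nblocks; b++)
        for (int n = 0; n < blocks[b].num_nodes; n++)
            for (int p = 0; p < blocks[b].procs_per_node; p++) {
                if (rank == max_ranks)
                    return rank;
                node_ids[rank++] = blocks[b].start_node + n;
            }
    return rank;
}

/* Heuristic from the commit message: a later block that starts at node 0
 * is treated as a wrap around.  Returns its index, or -1 if none. */
static int demo_find_wrap(const demo_map_block *blocks, size_t nblocks)
{
    for (size_t b = 1; b < nblocks; b++)
        if (blocks[b].start_node == 0)
            return (int) b;
    return -1;
}
```

As the message notes, a start node ID of 0 in a later block does not always mean a wrap around, so a check like `demo_find_wrap` is only as reliable as the process manager's node numbering.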
      
      Fixes #2007.
      Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>
    • handle when qsort is not available in vc setup · 7a9210ff
      Kenneth Raffenetti authored and Pavan Balaji committed
      
      
      When qsort is not available, don't define the comparison function and
      fall back to a simple insertion sort implementation. In the future, a
      more general function with a fallback should be added in MPL so it
      can be used in other cases like comm_split.
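
A minimal sketch of this kind of fallback is shown below. The macro `DEMO_HAVE_QSORT` and the function names are hypothetical stand-ins for whatever the configure check and the real code use.

```c
#include <assert.h>
#include <stddef.h>
#ifdef DEMO_HAVE_QSORT
#include <stdlib.h>
#endif

#ifdef DEMO_HAVE_QSORT
/* Comparison callback, only compiled when qsort is available. */
static int demo_compare_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;
    return (x > y) - (x < y);
}
#endif

/* Sort v[0..n-1] in ascending order. */
static void demo_sort_ints(int *v, size_t n)
{
    assert(v != NULL || n == 0);
#ifdef DEMO_HAVE_QSORT
    qsort(v, n, sizeof(int), demo_compare_int);
#else
    /* Fallback: simple insertion sort. */
    for (size_t i = 1; i < n; i++) {
        int key = v[i];
        size_t j = i;
        while (j > 0 && v[j - 1] > key) {
            v[j] = v[j - 1];
            j--;
        }
        v[j] = key;
    }
#endif
}
```

Guarding the comparison function with the same macro avoids an unused-function warning in the fallback build.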
      
      Refs #2007
      Signed-off-by: Pavan Balaji <balaji@mcs.anl.gov>
    • Improve PMI_process_mapping parsing. · 687dd1dc
      Pavan Balaji authored
      
      
      The original PMI process mapping parsing code made a number of
      assumptions that allowed it to work only on COMM_WORLD.  This
      patch corrects these so that it works for dynamic processes as well.

      It also fixes the computation of the number of nodes used so that it
      is correct in the general case.
      
      Refs #2007.
      Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>
    • Revert "Move communicator destruction to after progress checks." · b270ae2b
      Pavan Balaji authored
      This reverts commit 058c8bf0.
      
      Refs #1996.
    • Move communicator destruction to after progress checks. · 058c8bf0
      Pavan Balaji authored
      
      
      During finalize, we were destroying the COMM_WORLD, COMM_SELF and
      COMM_IWORLD communicator objects, and all other associated resources
      internally, before the final progress checks for incoming messages
      had finished.  This resulted in the following sequence of cleanup:

      1. COMM_WORLD got cleaned up.  Internally, there is a check to see if
      a group object has been allocated for COMM_WORLD.  If there is one, it
      is freed.

      2. We waited for other messages to arrive.  We noticed a failure at
      this time, so we tried to create a failed process group.  This uses the
      COMM_WORLD group internally, causing it to be created again, but with
      a reference count of 2, since the code assumes that the first
      reference always belongs to the original COMM_WORLD.

      3. When we tried to free the world group, we noticed that the reference
      count was 2, so we only decremented the reference count and never
      actually freed the object.

      Moving the check for incoming messages to happen before the
      communicator free fixes this problem.
      
      See #1996
      Signed-off-by: Wesley Bland <wbland@mcs.anl.gov>
  18. 27 Jan, 2014 2 commits
    • Remove a comment that doesn't apply anymore. · 201b0dbf
      Wesley Bland authored
      No reviewer
    • Moves the tag reservation to MPI layer · bb755b5c
      Wesley Bland authored and Pavan Balaji committed
      
      
      Resets MPIDI_TAG_UB back to 0x7fffffff. This value was changed a while back,
      but the change should have happened at the MPI layer instead of the CH3 layer.
      This resets the value to allow CH3 to use the tag space.
      
      Instead, the value is now set in the MPI layer during initthread. This means
      that it will be safe regardless of the device being used. This prevents a
      collision that was occurring on the pamid device where the values for
      MPIR_TAG_ERROR_BIT and the MPIR_Process.attr.tagged_coll_mask values were the
      same.
      
      Fixes #2008
      Signed-off-by: Pavan Balaji <balaji@mcs.anl.gov>
  19. 19 Jan, 2014 2 commits
  20. 10 Jan, 2014 1 commit
  21. 02 Jan, 2014 1 commit
  22. 30 Dec, 2013 3 commits
    • Fix compiler warnings about variables for assert · 8ea1dfec
      Antonio J. Pena authored
      
      
      Mark variables that are used only in assertions as potentially unused
      to avoid compiler warnings when the assertion macro expands to nothing.
      These warnings show up with --enable-fast.
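
The pattern looks roughly like the sketch below. `DEMO_UNUSED`, `demo_step` and `demo_run` are hypothetical names, and the standard `assert` stands in for the project's own assertion macro.

```c
#include <assert.h>

/* Hypothetical "potentially unused" marker; expands to nothing on
 * compilers without the GNU attribute syntax. */
#if defined(__GNUC__)
#define DEMO_UNUSED __attribute__((unused))
#else
#define DEMO_UNUSED
#endif

static int demo_step(int x)
{
    return x + 1;
}

static int demo_run(int x)
{
    /* rc is read only by the assertion.  When assertions are compiled
     * out (for example with NDEBUG), the marker keeps the compiler from
     * warning about an otherwise unused variable. */
    int rc DEMO_UNUSED = demo_step(x);
    assert(rc == x + 1);
    return x + 1;
}
```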
      Signed-off-by: Pavan Balaji <balaji@mcs.anl.gov>
    • Fix compiler warning in socksm.c · 2225f092
      Antonio J. Pena authored
      
      
      Fixes the following warnings when compiling socksm.c with --enable-strict:
      
      src/mpid/ch3/channels/nemesis/netmod/tcp/socksm.c: In function
      'alloc_sc_plfd_tbls':
      
      src/mpid/ch3/channels/nemesis/netmod/tcp/socksm.c:183:5: warning: the
      comparison will always evaluate as 'true' for the address of
      'MPID_nem_tcp_g_lstn_sc' will never be NULL [-Waddress]
      
      src/mpid/ch3/channels/nemesis/netmod/tcp/socksm.c:184:5: warning: the
      comparison will always evaluate as 'true' for the address of
      'MPID_nem_tcp_g_lstn_plfd' will never be NULL [-Waddress]
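
One way a -Waddress warning of this kind can arise is sketched below. This is an assumption-laden illustration with hypothetical names, not the code from socksm.c: an object with static storage always has a non-NULL address, so only a real pointer is worth testing.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_TBL_SIZE 4

/* Object with static storage: its address is fixed and never NULL. */
static int demo_static_tbl[DEMO_TBL_SIZE];

/* Pointer that may legitimately be NULL until it is set. */
static int *demo_dynamic_tbl = NULL;

/* Returns 1 once the dynamically assigned table is available.
 * A check such as `demo_static_tbl != NULL` would always be true, which
 * is what -Waddress reports, so it is omitted here. */
static int demo_tables_ready(void)
{
    return demo_dynamic_tbl != NULL;
}
```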
      
      Reported in ticket #1966
      Signed-off-by: Pavan Balaji <balaji@mcs.anl.gov>
    • Fix warnings in ch3u_rma_acc_ops and ch3u_rma_ops · 583e3f0a
      Antonio J. Pena authored
      
      
      Fixes the following warnings (with --enable-strict):
      
      src/mpid/ch3/src/ch3u_rma_acc_ops.c: In function 'MPIDI_Get_accumulate':
      src/mpid/ch3/src/ch3u_rma_acc_ops.c:31:5: warning: unused variable
      'mpiu_chklmem_stk_sz_' [-Wunused-variable]
      
      src/mpid/ch3/src/ch3u_rma_ops.c: In function 'MPIDI_Accumulate':
      src/mpid/ch3/src/ch3u_rma_ops.c:350:5: warning: unused variable
      'mpiu_chklmem_stk_sz_' [-Wunused-variable]
      
      See ticket #1966
      Signed-off-by: Pavan Balaji <balaji@mcs.anl.gov>
  23. 19 Dec, 2013 3 commits
  24. 18 Dec, 2013 4 commits