1. 24 Mar, 2014 6 commits
  2. 23 Mar, 2014 1 commit
    • Remove the use of MPIDI_TAG_UB · 055abbd3
      Wesley Bland authored and Pavan Balaji committed
      
      
      The constant MPIDI_TAG_UB is currently used in only one place, the
      initialization of ch3 (source:src/mpid/ch3/src/mpid_init.c@4b35902a#L131). The
      problem is that the value it sets (MPIR_Process.attrs.tag_ub) is set
      differently in pamid (INT_MAX). This mismatch leads to weird results when we
      set apart a bit in the tag space for failure propagation in non-blocking
      collectives (see #2008).
      
      Since the constant isn't referenced anywhere else, it serves no purpose and
      only causes confusion. To avoid this, we remove it and set
      MPIR_Process.attrs.tag_ub to INT_MAX in both ch3 and pamid.
      
      See #2009
      
      Signed-off-by: Pavan Balaji <balaji@mcs.anl.gov>
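
As a rough illustration of why a single upper bound matters once a bit of the tag space is set apart, the following sketch reserves the highest usable bit of a tag whose bound is INT_MAX. All names here (TAG_UB, ERROR_BIT, USER_TAG_UB and the helper functions) are hypothetical and are not MPICH identifiers; the sketch also assumes a two's-complement int.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical names for illustration only; not MPICH's identifiers. */
#define TAG_UB      INT_MAX   /* the same bound on every device */
/* Highest bit below the sign bit, derived from INT_MAX. */
#define ERROR_BIT   ((int)(((unsigned)INT_MAX >> 1) + 1u))
/* Largest tag left for ordinary use once the bit is reserved. */
#define USER_TAG_UB (ERROR_BIT - 1)

/* Mark a tag as carrying a failure notification. */
static int tag_set_error(int tag)
{
    assert(tag >= 0 && tag <= USER_TAG_UB);
    return tag | ERROR_BIT;
}

/* Report whether the reserved bit is set. */
static int tag_has_error(int tag)
{
    return (tag & ERROR_BIT) != 0;
}

/* Recover the original tag by clearing the reserved bit. */
static int tag_strip_error(int tag)
{
    return tag & ~ERROR_BIT;
}
```

If two devices used different bounds, ERROR_BIT would land on different bit positions, which is the kind of mismatch the commit removes.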
  3. 22 Mar, 2014 1 commit
  4. 21 Mar, 2014 1 commit
    • Rename Fortran binding directories. · 134f47a2
      Junchao Zhang authored and Pavan Balaji committed
      
      
      The names f77 and f90 did not accurately describe what the code
      provides.  For example, "f90" also used code from "f77".  This
      patch renames them to "mpif_h" and "use_mpi", which correspond
      to the mpif.h and "use mpi" module conventions specified by the
      MPI standard.
      
      This also includes changes to buildiface, autogen.sh and .gitignore to
      work with the new names.
      
      Signed-off-by: Pavan Balaji <balaji@mcs.anl.gov>
  5. 20 Mar, 2014 1 commit
  6. 19 Mar, 2014 1 commit
    • MPICH test case linked_list_lockall hang in MPI_Win_flush · 4b35902a
      Su Huang authored
      
      
      The hang occurs as follows:
      
        Assume the job runs with 4 tasks. Task 0 loops over the following RMA
        operations to fetch the displacement; the loop ends once the displacement
        has been updated.
      
          MPI_Get_accumulate (target rank is task 0)
          MPI_Win_flush(task 0)
      
        Tasks 1 and 3 hang in MPI_Win_flush() waiting for a call to
        MPI_Compare_and_swap() to complete. The target rank for this operation is
        task 0.
      
        Task 2 hangs in MPI_Win_flush() waiting for a call to MPI_Accumulate() to
        complete. The target rank for this operation is task 0 as well.
      
        Task 0 is busy calling MPI_Get_accumulate() and MPI_Win_flush() to see
        whether the displacement has been updated. The target rank of these
        operations is task 0 itself, so they are local and complete without
        calling the PAMI dispatcher. Meanwhile, the other three tasks issue RMA
        operations targeting task 0 and wait for them to complete. Because task 0
        only loops over local operations, the PAMI dispatcher is never called and
        no progress is made on the remote operations. This is the root cause of
        the hang.
      
      The fix is to call the PAMI dispatcher in MPI_Win_flush() before checking the
      completion condition. The current code checks the condition first and, if it
      is already satisfied, never calls the PAMI dispatcher.
      
      The following statement in MPI_Win_flush()
      
        MPID_PROGRESS_WAIT_WHILE(sync->total != sync->complete)
      
      will be replaced by
      
        MPID_PROGRESS_WAIT_DO_WHILE(sync->total != sync->complete)
      
      (ibm) D196445
      
      Signed-off-by: Michael Blocksome <blocksom@us.ibm.com>
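
The difference between checking first and advancing first can be sketched in plain C. This is only an illustration under stated assumptions: struct sync_state, advance_progress() and the two flush functions are hypothetical stand-ins, not the pamid macros or the PAMI API, and the stand-in dispatcher simply completes one pending operation per call.

```c
#include <assert.h>

/* Hypothetical stand-in for the window's synchronization counters. */
struct sync_state {
    int total;     /* operations issued */
    int complete;  /* operations finished */
};

/* Stand-in for one pass of the dispatcher: completes one pending operation,
 * if any. The real dispatcher would also service remote requests. */
static void advance_progress(struct sync_state *s)
{
    assert(s->complete <= s->total);
    if (s->complete < s->total)
        s->complete++;
}

/* Check first: the dispatcher never runs when nothing local is pending. */
static int flush_check_first(struct sync_state *s)
{
    int calls = 0;
    while (s->total != s->complete) {
        advance_progress(s);
        calls++;
    }
    return calls;
}

/* Advance first: the dispatcher runs at least once on every call. */
static int flush_advance_first(struct sync_state *s)
{
    int calls = 0;
    do {
        advance_progress(s);
        calls++;
    } while (s->total != s->complete);
    return calls;
}
```

With all local operations already complete, flush_check_first() returns without ever entering the dispatcher, which matches the starvation described in the commit; flush_advance_first() always makes at least one pass.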
  7. 18 Mar, 2014 1 commit
  8. 17 Mar, 2014 7 commits
    • Add style comments for coding style checker · cfe6626b
      William Gropp authored and Kenneth Raffenetti committed
      
      
      These files contain permitted uses of routines that our coding style does
      not always allow.  These edits tell the coding style checker that these
      uses are OK (most uses of printf or malloc/free in MPICH are incorrect;
      see the style guide).
      
      Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>
    • Use MPIR_MIN/MPIR_MAX instead of MIN/MAX · c6f93012
      William Gropp authored and Kenneth Raffenetti committed
      
      
      The coding standards are clear on this; since we can't trust the system
      header files not to make definitions that conflict with ours, good
      coding practice is to make all MPICH names distinct.  The MPIR_MIN
      and MPIR_MAX definitions were made long ago for just this purpose.
      
      Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>
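
A minimal sketch of the idea, assuming conventional ternary definitions: the exact MPICH definitions of MPIR_MIN and MPIR_MAX may differ, and clamp_count() is a hypothetical helper added only to show usage.

```c
#include <assert.h>

/* Prefixed helpers cannot collide with a MIN/MAX that a system header
 * may define. Assumed definitions; MPICH's own may differ. */
#define MPIR_MIN(a, b) (((a) < (b)) ? (a) : (b))
#define MPIR_MAX(a, b) (((a) > (b)) ? (a) : (b))

/* Hypothetical helper: restrict count to the range [lo, hi]. */
static int clamp_count(int count, int lo, int hi)
{
    assert(lo <= hi);
    return MPIR_MAX(lo, MPIR_MIN(count, hi));
}
```

As with any function-like macro of this shape, each argument may be evaluated more than once, so arguments with side effects should be avoided.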
    • Add style commands to generated files · 4db583ed
      William Gropp authored and Kenneth Raffenetti committed
      
      
      The coding checker sometimes needs some hints that certain uses are
      permitted.  This adds the necessary information to some of the generated
      files.
      
      Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>
    • Fixed configure when disabling f77/fc · d4e30cc0
      Antonio J. Pena authored
      
      
      Fixes #2048
      
      Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>
    • some lustre parameters larger than signed int · c8a19a9f
      Rob Latham authored
      
      
      Discussion with Michael Raymond: a customer would "like to use larger
      than 32-bit stripe units".  Technically, lustre only supports a 32-bit
      value for its stripe units -- an *unsigned* 32-bit value.
      
      While we're here, let's double check that we do not try to jam a larger
      value into the lustre hint structure than lustre can support.
      
      Signed-off-by: Michael Raymond <mraymond@sgi.com>
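
The range check described above could look roughly like the following. This is a sketch, not the code from the commit: set_stripe_unit() and its return convention are hypothetical, and the hint field is modeled as a bare uint32_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store a user-supplied stripe unit into an unsigned
 * 32-bit field only when it fits. Returns 0 on success, -1 when the value
 * is negative or too large, leaving the field untouched in that case. */
static int set_stripe_unit(long long requested, uint32_t *field)
{
    assert(field != NULL);
    if (requested < 0 || requested > (long long)UINT32_MAX)
        return -1;
    *field = (uint32_t)requested;
    return 0;
}
```

A value such as 3000000000 does not fit in a signed 32-bit int but is accepted here, while 4294967296 (2^32) is rejected.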
    • Fix weak symbol redefinition error with Pathscale · 486cac76
      Junchao Zhang authored
      
      
      The Pathscale C compiler reports a symbol redefinition error for mpi_conversion_fn_null__ in code such as
      
      extern int mpi_conversion_fn_null__ ( void*v1, MPI_Fint*v2, MPI_Fint*v3, void*v4, MPI_Offset*v5, MPI_Fint *v6, MPI_Fint*v7, MPI_Fint *ierr ) __attribute__((weak,alias("mpi_conversion_fn_null_")));
      #pragma weak mpi_conversion_fn_null__ = mpi_conversion_fn_null_
      
      We should generate only one form of this code, using either "__attribute__" or "#pragma weak".
      
      Fixes #2044
      
      Signed-off-by: Antonio J. Pena <apenya@mcs.anl.gov>
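
One way to emit exactly one alias form is to select it with the preprocessor. The sketch below is an assumption about how such a guard might look, not the generated MPICH code: USE_WEAK_ATTRIBUTE and USE_PRAGMA_WEAK are hypothetical macros that a configure step would normally define, and conv_fn_null_ is a shortened stand-in for the real routine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical feature macros; a real build would set one from configure.
 * Here the attribute form is chosen automatically on GNU-compatible ELF
 * toolchains so that the sketch compiles on its own. */
#if defined(__GNUC__) && defined(__ELF__) && !defined(USE_PRAGMA_WEAK)
#define USE_WEAK_ATTRIBUTE 1
#endif

/* The single real definition (stand-in for the conversion routine). */
int conv_fn_null_(int *ierr)
{
    assert(ierr != NULL);
    *ierr = 0;
    return 0;
}

#if defined(USE_WEAK_ATTRIBUTE)
/* Form 1: the alias is declared through the attribute only. */
extern int conv_fn_null__(int *ierr)
    __attribute__((weak, alias("conv_fn_null_")));
#elif defined(USE_PRAGMA_WEAK)
/* Form 2: the alias is declared through the pragma only. */
extern int conv_fn_null__(int *ierr);
#pragma weak conv_fn_null__ = conv_fn_null_
#else
/* Fallback when neither mechanism is available: a plain wrapper. */
int conv_fn_null__(int *ierr)
{
    return conv_fn_null_(ierr);
}
#endif
```

Because the branches are mutually exclusive, the second name is introduced only once, which avoids the redefinition that the commit describes.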
    • Fixes test scripts in test/mpi · fcbf5261
      Huiwei Lu authored
      Supports 'make dist', 'make -j' and others.
      
      1. Several tests have been added to test/mpi since ticket 1720 was
      created. These new tests are added to the corresponding Makefile.am.
      
      2. There is one minor thing I am not sure about in this fix. I don't know the
      difference between noinst_HEADERS and nodist_noinst_HEADERS, so I just put all
      header files under noinst_HEADERS (in test/mpi/Makefile.am).
      
      3. Windows project files like adapt.vcproj were deleted in commit
      ba3badff, so they should be removed from test/mpi/basic/Makefile.am too.
      
      4. MPICH ABI removed support for MPI::Distgraphcomm, so distgraphcxx.cxx
      is no longer compiled in cxx/topo.
      
      5. One minor change to the help description in test/mpi/runtests.in: when you
      run './runtests -batch', you get a help message directing you to the
      README.
      
      Fixes #1720
      
      Signed-off-by: Antonio J. Pena <apenya@mcs.anl.gov>
  9. 15 Mar, 2014 1 commit
  10. 14 Mar, 2014 2 commits
  11. 13 Mar, 2014 4 commits
  12. 11 Mar, 2014 1 commit
  13. 10 Mar, 2014 11 commits
  14. 09 Mar, 2014 1 commit
  15. 08 Mar, 2014 1 commit