1. 26 Mar, 2014 9 commits
    • Allocate two-phase buffer outside write path · 5e34974e
      Rob Latham authored
      There are many memory allocations in the write path.  Allocating the
      two-phase intermediate buffer outside of the write path might on some
      systems make a small difference, especially if there are many collective
      I/O calls, or if the system (like Blue Gene) has a small amount of
      memory.  Modified from Paul Coffman <pkcoff@us.ibm.com>'s original idea.
    • remove unneeded barrier · 6ca13e5d
      Rob Latham authored
      For quite some time the barrier here has had the comment 'Why?'.  Since
      no one knows, and there are plenty of other synchronization points in
      this path, remove it.
    • bluegene timing: condense into one set of timers · f3a43a5a
      Rob Latham authored
      The bluegene timer code had two "levels" of timing.  That seemed kind of
      pointless, so lump it all into one level.
    • use pwrite/pread instead of seek+write/read · 5bc8aedc
      Rob Latham authored
      These "new" system calls (part of POSIX-2001) save us a system call on
      Blue Gene.  Seems to get us back 5 seconds for one workload at small
      (half rack) scales.
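      A minimal sketch of the difference, not the ROMIO code itself: the
      function names below are hypothetical, and only the POSIX calls
      (lseek, write, pwrite, pread) are real.  The seek+write path costs two
      system calls per operation, while pwrite carries the offset in a single
      call and leaves the descriptor's file offset untouched.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Two system calls: position the file offset, then write. */
ssize_t write_at_seek(int fd, const void *buf, size_t len, off_t offset)
{
    if (lseek(fd, offset, SEEK_SET) == (off_t)-1)
        return -1;
    return write(fd, buf, len);
}

/* One system call: the offset travels with the request. */
ssize_t write_at_pwrite(int fd, const void *buf, size_t len, off_t offset)
{
    return pwrite(fd, buf, len, offset);
}

/* Hypothetical self-check: write with both styles, read back with pread().
 * Returns 0 on success, -1 on any failure. */
int roundtrip_demo(const char *path)
{
    char out[5] = {0};
    int ok;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    ok = write_at_seek(fd, "abcd", 4, 0) == 4
        && write_at_pwrite(fd, "XY", 2, 1) == 2
        && pread(fd, out, 4, 0) == 4
        && strcmp(out, "aXYd") == 0;
    close(fd);
    unlink(path);
    return ok ? 0 : -1;
}
```

      Both helpers produce the same file contents; the saving is purely in
      the number of kernel transitions per I/O request.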
    • bg-timing: DO NOT MERGE WITH MASTER: time lockless · c97af627
      Rob Latham authored
      bglockless uses the common read/write routines for contig read/writes, so
      bluegene timing infrastructure wasn't actually timing anything.  Since
      this introduces blue gene bits into common code, please do not merge to
      master.  Instead, we should rework all the timing bits so that it no
      longer times "bluegene" but rather all of ROMIO.  Furthermore, the
      locky bits of 'bg:' driver should be yanked anyway, obviating the need
      for bglockless.
    • dust off old Blue Gene timing infrastructure · 751176bc
      Rob Latham authored
      Protected by an 'ifdef', this BGL-era code bitrotted a bit.  clean it up
      and see if it does anything useful today.
      - Removes preprocessor guards: the counters and timers do nothing
        expensive unless environment variables are set
      - remove the idea of a "level"
      - remove barrier from timing collection.
      - bugfix: MPI_Wtime() does not necessarily start at zero, so properly initialize
        timers for collective read/write
      - report only from I/O aggregators.  when reporting "time spent in i/o"
        vs "time spent communicating" it makes more sense to look only at the
        aggregators.  The non-aggregators are going to skew the results
        because they are spending some communication time actually
        communicating, but some of that time blocked, waiting for aggregators
        to finish.
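      The MPI_Wtime() bugfix above can be illustrated with a small sketch.
      This is not the ROMIO timer code: wall_time() is a stand-in built on
      clock_gettime() so the example runs without MPI, and struct io_timer
      and its helpers are hypothetical names.  The point is that a wall clock
      counts from an arbitrary origin, so a timer has to record its own start
      value and accumulate differences rather than assume the clock began at
      zero.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Stand-in for MPI_Wtime(): seconds since an arbitrary point in the past. */
static double wall_time(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Hypothetical timer: 'start' is a raw clock reading, 'elapsed' a duration. */
struct io_timer {
    double start;
    double elapsed;
};

void timer_init(struct io_timer *t)
{
    t->start = wall_time();   /* not 0.0: the clock origin is arbitrary */
    t->elapsed = 0.0;
}

void timer_start(struct io_timer *t)
{
    t->start = wall_time();
}

void timer_stop(struct io_timer *t)
{
    /* Accumulate a difference of two readings, never a raw reading. */
    t->elapsed += wall_time() - t->start;
}
```

      A timer whose start field was left at zero would instead report the
      clock's absolute value as "time spent", which is what the bugfix
      guards against.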
    • avoid duplicate data in MPIR_proctable · a4b73a8e
      Kenneth Raffenetti authored and Pavan Balaji committed
      De-dupes executable and host names in the MPIR_proctable by pointing
      to an existing copy. Closes #1821
      Signed-off-by: Pavan Balaji <balaji@mcs.anl.gov>
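      One way such de-duplication might look, as a sketch only: struct
      proc_entry and proctable_set() are hypothetical and are not taken from
      the commit.  Each new entry reuses the pointer of an earlier entry with
      an equal string and copies the string only when none exists.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical entry layout, loosely modelled on a debugger process table. */
struct proc_entry {
    char *host_name;
    char *executable_name;
    int pid;
};

static char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p)
        memcpy(p, s, n);
    return p;
}

/* Fill table[idx], sharing strings with entries 0..idx-1 where possible.
 * Returns 0 on success, -1 if an allocation fails. */
int proctable_set(struct proc_entry *table, int idx,
                  const char *host, const char *exe, int pid)
{
    char *h = NULL, *e = NULL;
    int own_h = 0, own_e = 0;

    for (int i = 0; i < idx; i++) {
        if (!h && strcmp(table[i].host_name, host) == 0)
            h = table[i].host_name;
        if (!e && strcmp(table[i].executable_name, exe) == 0)
            e = table[i].executable_name;
    }
    if (!h) { h = copy_string(host); own_h = 1; }
    if (!e) { e = copy_string(exe);  own_e = 1; }
    if (!h || !e) {
        if (own_h) free(h);
        if (own_e) free(e);
        return -1;
    }
    table[idx].host_name = h;
    table[idx].executable_name = e;
    table[idx].pid = pid;
    return 0;
}
```

      Because entries share pointers, teardown code in a design like this
      must free each distinct string once rather than once per entry.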
    • Revert aclocal_cc.m4 commits that tweak VA_ARGS. · 5108fe17
      Pavan Balaji authored
      The following commits are reverted.
      1. "Better checks for VA_ARGS."; commit
      2. "Warning squash for clang."; commit
      The clang warning this was originally trying to solve has been fixed
      by the newer versions of clang, AFAICT.
      Signed-off-by: Huiwei Lu <huiweilu@mcs.anl.gov>
    • Remove more windows files. · 24a01405
      Pavan Balaji authored
      No reviewer.
  2. 25 Mar, 2014 1 commit
  3. 24 Mar, 2014 10 commits
  4. 23 Mar, 2014 1 commit
    • Remove the use of MPIDI_TAG_UB · 055abbd3
      Wesley Bland authored and Pavan Balaji committed
      The constant MPIDI_TAG_UB is used in only one place at the moment, in the
      initialization of ch3 (source:src/mpid/ch3/src/mpid_init.c@4b35902a#L131). The
      problem is that the value which is being set (MPIR_Process.attrs.tag_ub) is
      set differently in pamid (INT_MAX). This leads to weird results when we set
      apart a bit in the tag space for failure propagation in non-blocking
      collectives (see #2008).
      Since this value isn't being referenced anywhere else, there doesn't seem to
      be a use for it and it's just leading to confusion. To avoid this, here we
      remove this value and just set MPIR_Process.attrs.tag_ub to INT_MAX in both
      ch3 and pamid.
      See #2009
      Signed-off-by: Pavan Balaji <balaji@mcs.anl.gov>
  5. 22 Mar, 2014 1 commit
  6. 21 Mar, 2014 1 commit
    • Rename Fortran binding directories. · 134f47a2
      Junchao Zhang authored and Pavan Balaji committed
      f77 and f90 were not consistent names with respect to what the code
      was providing.  For example, "f90" also used code from "f77".  This
      patch changes the naming to "mpif_h" and "use_mpi", which correspond
      to using the mpif.h and "use mpi module" conventions specified by the
      MPI standard.
      This also includes changes to buildiface, autogen.sh and .gitignore to
      work with these changes.
      Signed-off-by: Pavan Balaji <balaji@mcs.anl.gov>
  7. 20 Mar, 2014 1 commit
  8. 19 Mar, 2014 1 commit
    • MPICH test case linked_list_lockall hang in MPI_Win_flush · 4b35902a
      Su Huang authored
      The scenario of the hang is as follows:
        Assuming the job runs with 4 tasks, task 0 loops over the following RMA
        operations to fetch the displacement; the loop ends once the displacement
        has been updated.
          MPI_Win_get_accumulate( target rank is task 0)
          MPI_Win_flush(task 0)
        task 1 and 3 hang in MPI_Win_flush() waiting for a call to
        MPI_Win_compare_and_swap() to complete. The target rank for this operation is
        task 0.
        task 2 hangs in MPI_Win_flush() waiting for a call to MPI_Accumulate() to
        complete. The target rank for this operation is task 0 as well.
        Task 0 is busy making MPI_Win_get_accumulate() and MPI_Win_flush() calls to
        see whether the displacement has been updated. The target rank of the
        operation is task 0 itself, which means the operation is local and can
        complete without calling the PAMI dispatcher.  Meanwhile, the other three
        tasks issue RMA operations to target task 0 and wait for those operations
        to complete. Because task 0 is in a loop of local operations, the PAMI
        dispatcher is never called and no progress is made on any remote operation,
        which is the root cause of the hang.
      The fix is to add a call to the PAMI dispatcher in MPI_Win_flush() before the
      condition is checked. The current code checks the condition first, and if no
      operations are outstanding (sync->total == sync->complete), the PAMI
      dispatcher is never called.
      The following statement in MPI_Win_flush()
        MPID_PROGRESS_WAIT_WHILE(sync->total != sync->complete)
      will be replaced by
        MPID_PROGRESS_WAIT_DO_WHILE(sync->total != sync->complete)
      (ibm) D196445
      Signed-off-by: Michael Blocksome <blocksom@us.ibm.com>
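      The difference between the two wait forms can be sketched as follows.
      The macros and helpers here are hypothetical stand-ins, not the MPICH
      definitions of MPID_PROGRESS_WAIT_WHILE or MPID_PROGRESS_WAIT_DO_WHILE:
      advance_progress() plays the role of the PAMI dispatcher and simply
      counts its calls and completes one outstanding operation.

```c
/* Hypothetical sync record mirroring the total/complete counters. */
struct sync_state {
    int total;
    int complete;
};

/* Stand-in for the dispatcher: counts invocations, retires one operation. */
static int progress_calls = 0;

static void advance_progress(struct sync_state *s)
{
    progress_calls++;
    if (s->complete < s->total)
        s->complete++;
}

/* Check first: the dispatcher is skipped when nothing is outstanding. */
#define PROGRESS_WAIT_WHILE(s, cond) \
    while (cond) { advance_progress(s); }

/* Progress first: the dispatcher always runs at least once. */
#define PROGRESS_WAIT_DO_WHILE(s, cond) \
    do { advance_progress(s); } while (cond)

/* Return how many dispatcher calls a flush makes with each form. */
int flush_check_first(struct sync_state *s)
{
    progress_calls = 0;
    PROGRESS_WAIT_WHILE(s, s->total != s->complete);
    return progress_calls;
}

int flush_progress_first(struct sync_state *s)
{
    progress_calls = 0;
    PROGRESS_WAIT_DO_WHILE(s, s->total != s->complete);
    return progress_calls;
}
```

      With no outstanding operations the check-first form makes zero
      dispatcher calls, which is the situation task 0 was stuck in; the
      do-while form makes one call per flush, giving remote operations a
      chance to progress.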
  9. 18 Mar, 2014 1 commit
  10. 17 Mar, 2014 7 commits
    • Add style comments for coding style checker · cfe6626b
      William Gropp authored and Kenneth Raffenetti committed
      These files had allowed uses of routines that aren't always permitted
      according to our coding style.  These edits let the coding style
      checker know that these uses are ok (most uses of printf or malloc/free
      in MPICH are incorrect - see the style guide).
      Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>
    • Use MPIR_MIN/MPIR_MAX instead of MIN/MAX · c6f93012
      William Gropp authored and Kenneth Raffenetti committed
      The coding standards are clear on this; since we can't trust the system
      header files not to make definitions that conflict with ours, good
      coding practice is to make all MPICH names distinct.  The MPIR_MIN
      and MPIR_MAX definitions were made long ago for just this purpose.
      Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>
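      A sketch of why prefixed names help.  The macro bodies below are an
      assumption about what such definitions typically look like, not a copy
      of the MPICH headers, and clamp_count() is a hypothetical caller.  Some
      system headers define their own MIN and MAX, so unprefixed definitions
      can collide with them; names carrying the project prefix cannot.

```c
/* Assumed definitions: prefixed names cannot clash with a system MIN/MAX. */
#define MPIR_MAX(a, b) (((a) > (b)) ? (a) : (b))
#define MPIR_MIN(a, b) (((a) < (b)) ? (a) : (b))

/* Hypothetical caller: clamp a count into the range [lo, hi]. */
int clamp_count(int count, int lo, int hi)
{
    return MPIR_MIN(MPIR_MAX(count, lo), hi);
}
```

      As with any function-like macro of this shape, arguments are evaluated
      more than once, so callers should avoid passing expressions with side
      effects.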
    • Add style commands to generated files · 4db583ed
      William Gropp authored and Kenneth Raffenetti committed
      The coding checker sometimes needs some hints that certain uses are
      permitted.  This adds the necessary information to some of the generated
      Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>
    • Fixed configure when disabling f77/fc · d4e30cc0
      Antonio J. Pena authored
      Fixes #2048
      Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>
    • some lustre parameters larger than signed int · c8a19a9f
      Rob Latham authored
      Discussion with Michael Raymond: a customer would "like to use larger
      than 32-bit stripe units".  Technically, lustre only supports a 32 bit
      value for its stripe units -- an *unsigned* 32 bit value.
      Since we're here, let's double check that we do not try to jam a larger
      value into the lustre hint structure than lustre can support.
      Signed-off-by: Michael Raymond <mraymond@sgi.com>
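      The range check described above might look like the following sketch.
      stripe_value_fits() is a hypothetical helper, not the ROMIO lustre
      driver code: it accepts a wide signed value as parsed from a hint and
      stores it only if it fits in an unsigned 32-bit field.

```c
#include <stdint.h>

/* Hypothetical helper: store 'requested' in *out only if it fits in an
 * unsigned 32-bit field.  Returns 1 on success, 0 if out of range. */
int stripe_value_fits(long long requested, uint32_t *out)
{
    if (requested < 0 || (unsigned long long)requested > UINT32_MAX)
        return 0;
    *out = (uint32_t)requested;
    return 1;
}
```

      Values between INT_MAX and UINT32_MAX are accepted here, which a signed
      int could not represent, while anything above UINT32_MAX is rejected
      instead of being silently truncated.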
    • Fix weak symbol redefinition error with Pathscale · 486cac76
      Junchao Zhang authored
      The Pathscale C compiler reports a symbol redefinition error for mpi_conversion_fn_null__ in code such as
      extern int mpi_conversion_fn_null__ ( void*v1, MPI_Fint*v2, MPI_Fint*v3, void*v4, MPI_Offset*v5, MPI_Fint *v6, MPI_Fint*v7, MPI_Fint *ierr ) __attribute__((weak,alias("mpi_conversion_fn_null_")));
      #pragma weak mpi_conversion_fn_null__ = mpi_conversion_fn_null_
      We should generate only one form of the code, using either "__attribute__" or "#pragma weak"
      Fixes #2044
      Signed-off-by: Antonio J. Pena <apenya@mcs.anl.gov>
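      A sketch of emitting exactly one weak-alias form.  The function names
      and the DEMO_USE_PRAGMA_WEAK switch are hypothetical and are not the
      macros buildiface generates; the example also assumes a GCC-compatible
      compiler on an ELF platform, where the alias attribute is supported.

```c
/* Hypothetical function standing in for a generated Fortran binding. */
int demo_fn_null_(int *ierr)
{
    *ierr = 0;
    return 0;
}

/* Emit one form only; declaring the alias both ways is what triggered the
 * redefinition error described above. */
#if defined(DEMO_USE_PRAGMA_WEAK)
#pragma weak demo_fn_null__ = demo_fn_null_
extern int demo_fn_null__(int *ierr);
#else
extern int demo_fn_null__(int *ierr)
    __attribute__((weak, alias("demo_fn_null_")));
#endif
```

      Either branch makes demo_fn_null__ a weak alias for demo_fn_null_; the
      preprocessor guard just ensures the two spellings never appear
      together.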
    • Fixes test scripts in test/mpi · fcbf5261
      Huiwei Lu authored
      Supports 'make dist', 'make -j' and others.
      1. Several tests have been added to test/mpi since ticket 1720 was
      created. These new tests are added to the corresponding Makefile.am.
      2. There is one minor thing I am not sure about in this fix. I don't know the
      difference between noinst_HEADERS and nodist_noinst_HEADERS, so I just put all
      header files under noinst_HEADERS (in test/mpi/Makefile.am).
      3. Windows project files like adapt.vcproj were deleted in commit
      , they should be removed in
      test/mpi/basic/Makefile.am too.
      4. MPICH ABI removed support for MPI::Distgraphcomm, so distgraphcxx.cxx
      is changed to be not compiled in cxx/topo.
      5. One minor change in help description in test/mpi/runtests.in. So when you
      run './runtests -batch', you will get help message directing you to
      Fixes #1720
      Signed-off-by: Antonio J. Pena <apenya@mcs.anl.gov>
  11. 15 Mar, 2014 1 commit
  12. 14 Mar, 2014 2 commits
  13. 13 Mar, 2014 4 commits