1. 22 Apr, 2015 4 commits
    • Cleanup threaded progress. · f385680e
      Pavan Balaji authored
      
      
      The nemesis progress engine was written so that if one thread is
      inside the progress engine, other threads cannot enter the receive
      progress, although they can enter the send progress in some cases.
      There doesn't seem to be a good reason for this behavior.  This patch
      unifies the two paths: threads simply return for nonblocking
      operations, and wait for a signal before entering the progress
      engine for blocking operations.
      Signed-off-by: Halim Amer <aamer@anl.gov>
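The behavior described in this commit message can be sketched roughly as follows. This is a minimal illustration using POSIX threads, not the nemesis code: `progress_enter`, `poll_once`, `in_progress`, and the lock and condition variable names are all hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical state; these are not the nemesis identifiers. */
static pthread_mutex_t progress_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t progress_done = PTHREAD_COND_INITIALIZER;
static bool in_progress = false;

/* Stand-in for one pass over the send and receive queues. */
static int polls = 0;
static void poll_once(void) { polls++; }

/* Returns true if this thread ran the progress engine, false if it
 * returned immediately because another thread was already inside. */
bool progress_enter(bool blocking)
{
    pthread_mutex_lock(&progress_lock);
    if (in_progress && !blocking) {
        /* Nonblocking operation: simply return. */
        pthread_mutex_unlock(&progress_lock);
        return false;
    }
    /* Blocking operation: wait for a signal before entering. */
    while (in_progress)
        pthread_cond_wait(&progress_done, &progress_lock);
    assert(!in_progress);
    in_progress = true;
    pthread_mutex_unlock(&progress_lock);

    poll_once();

    /* Leave the engine and wake any waiting threads. */
    pthread_mutex_lock(&progress_lock);
    in_progress = false;
    pthread_cond_broadcast(&progress_done);
    pthread_mutex_unlock(&progress_lock);
    return true;
}
```

A nonblocking caller would invoke `progress_enter(false)` and carry on if it returns false; a blocking caller would invoke `progress_enter(true)`.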
    • Fix wrong alias names. · 5fb750b9
      Sangmin Seo authored
      
      
      The target named in __attribute__((weak,alias())) should be a function
      name starting with PMPI, but for some MPIX functions, such as
      MPIX_Grequest_class_create, MPIX_Grequest_class_allocate,
      MPIX_Grequest_start, MPIX_Mutex_create, MPIX_Mutex_free,
      MPIX_Mutex_lock, and MPIX_Mutex_unlock, the alias name was the same as
      that of the original function. This patch fixes the wrong alias names
      in __attribute__((weak,alias())) and also fixes some wrong alias names
      in #pragma.
      Signed-off-by: Huiwei Lu <huiweilu@mcs.anl.gov>
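As an illustration of the convention this commit restores, here is a minimal sketch of a weak alias whose target carries the PMPI-style prefix. The function `MPIX_Example_call` is hypothetical and not part of MPICH, and the sketch assumes GCC or Clang on an ELF platform, where the `alias` attribute is supported.

```c
/* Hypothetical function pair following the MPI profiling convention:
 * the PMPIX_ name carries the implementation. */
int PMPIX_Example_call(int x)
{
    return x + 1;
}

/* The weak MPIX_ symbol aliases the PMPIX_ definition. A profiling tool
 * can then supply its own strong MPIX_Example_call and still reach the
 * implementation through PMPIX_Example_call. Writing
 * alias("MPIX_Example_call") here instead would be the kind of mistake
 * the commit message describes. */
int MPIX_Example_call(int x)
    __attribute__((weak, alias("PMPIX_Example_call")));
```

Calling either name reaches the same code unless another object file overrides the weak symbol.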
    • Antonio J. Pena's avatar
      e60c9375
    • mxm: fix anysource_matched · 1be5fc49
      Kenneth Raffenetti authored
      
      
      The return value of anysource_matched should be the actual result
      of the cancel operation. If the result is uncancelable, i.e. already
      matched, then CH3 will let the netmod message win and move on to the
      other requests in the queue. When the completion for the unsuccessfully
      canceled message comes in, we process it like normal.
      Reviewed-by: Igor Ivanov <Igor.Ivanov@itseez.com>
      Signed-off-by: Antonio J. Pena <apenya@mcs.anl.gov>
  2. 21 Apr, 2015 2 commits
  3. 20 Apr, 2015 9 commits
  4. 17 Apr, 2015 9 commits
  5. 16 Apr, 2015 2 commits
  6. 15 Apr, 2015 3 commits
    • Increase bcast_full time limit. · 04060d1d
      Pavan Balaji authored
      We increased the number of cases the bcast test was running in
      [e01a20b6].  This is causing it to time out on some platforms, where
      the test now seems to take close to 3 minutes.  This increased timeout
      should be sufficient on those platforms.
      
      No reviewer.
    • OFI Netmod: Add CVAR enhancements for OFI provider selection · 1ed2b434
      Charles J Archer authored
       * Rename MPIR_CVAR_DUMP_PROVIDERS to MPIR_CVAR_OFI_DUMP_PROVIDERS
       * Add MPIR_CVAR_OFI_USE_PROVIDER, which takes the name of the desired
         provider as a string
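A string-valued selection variable like this could be consumed along the following lines. This is only a sketch, assuming the CVAR is supplied as an environment variable of the same name. The helpers `match_provider` and `select_provider`, and the fallback to the first provider when nothing is requested, are hypothetical and not taken from the OFI netmod.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return the index of the provider named `wanted`,
 * the first provider if no name was requested (an assumed fallback),
 * or -1 if the requested name is not available. */
int match_provider(const char *const names[], int count, const char *wanted)
{
    if (wanted == NULL || wanted[0] == '\0')
        return count > 0 ? 0 : -1;
    for (int i = 0; i < count; i++) {
        if (strcmp(names[i], wanted) == 0)
            return i;
    }
    return -1;
}

/* Reads the requested provider name from the environment. */
int select_provider(const char *const names[], int count)
{
    return match_provider(names, count,
                          getenv("MPIR_CVAR_OFI_USE_PROVIDER"));
}
```

Keeping the string matching separate from the `getenv` call makes the selection logic easy to exercise without touching the process environment.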
    • PAMID: MPI_Allreduce/MPI_Reduce coredump w/ DOUBLE_INT datatype · e87c158f
      Sameh Sharkawi authored
      
      
      This commit includes multiple fixes:
       - Fixes for MPI_IN_PLACE checking. cudaPointerGetAttributes returns
         true for MPI_IN_PLACE, which causes issues. We now check for
         MPI_IN_PLACE before passing the pointer to CUDA.
       - Enabling PAMID geometries (in order to get to PAMID collectives) when
         MP_CUDA_AWARE=yes. This allows for intercepting CUDA buffers.
       - Disabling FCA when MP_CUDA_AWARE=yes, even if the user enables FCA.
       - Copying the user recv buffer into a temp recv host buffer before the
         collective starts, especially in MPI_IN_PLACE cases.
      
      (ibm) D203255
      Signed-off-by: Tsai-Yang (Alan) Jea <tjea@us.ibm.com>
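The first fix in this commit, checking for MPI_IN_PLACE before asking CUDA about a pointer, can be sketched as below. To stay self-contained the sketch uses neither mpi.h nor the CUDA runtime: `IN_PLACE_SENTINEL` is a stand-in for MPI_IN_PLACE (its value here is an assumption), and `query_is_device_pointer` is a hypothetical stub for the CUDA pointer-attribute query that deliberately claims every pointer is a device pointer, mimicking the misclassification described above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for MPI_IN_PLACE so the sketch builds without mpi.h. */
#define IN_PLACE_SENTINEL ((void *) -1)

/* Hypothetical stub for the CUDA pointer query. It reports every
 * pointer as a device pointer, including the sentinel. */
static bool query_is_device_pointer(const void *p)
{
    (void) p;
    return true;
}

/* Check for the sentinel first, and only then ask the (stubbed)
 * CUDA query about the buffer. */
bool is_device_buffer(const void *sendbuf)
{
    if (sendbuf == IN_PLACE_SENTINEL)
        return false;
    return query_is_device_pointer(sendbuf);
}
```

With the guard in place the sentinel is never handed to the query, so it cannot be mistaken for a device buffer.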
  7. 14 Apr, 2015 2 commits
    • Fixed the Fortran common symbol issue on Mac. · eb0e7712
      Min Si authored
      
      
      The linker on Darwin does not allow common symbols, so libtool adds
      the -fno-common option by default for shared libraries. However,
      common symbols defined in different shared libraries and object files
      still cannot be treated as the same symbol.
      For example:
      with gfortran, the same common block in the shared libraries and in
      the object files gets different memory locations;
      with ifort, the same common block in different shared libraries gets
      the same memory location but still gets a different location in the
      object file.

      The -Wl,-commons,use_dylibs option asks the linker to check dylibs for
      definitions and use them to replace tentative definitions (commons)
      from object files. This resolves the common symbol mismatch between
      the object file and the dylibs: the address of a common symbol is set
      to its location in the first dylib that is linked with the object
      file and contains the symbol. The option needs to be added only at
      the link stage for the final executable.

      The -flat-namespace option allows the linker to unify the same common
      symbols in different dylibs. It needs to be added at the link stage
      for both the shared library and the final executable.
      (See man ld for their definitions.)

      Although gfortran works with only -flat-namespace, and ifort works
      with only -Wl,-commons,use_dylibs, we add both options here as a
      generic solution to be safe.
      Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>
    • Charles J Archer's avatar
  8. 11 Apr, 2015 1 commit
  9. 10 Apr, 2015 7 commits
    • portals4: tuning · daf29e33
      Kenneth Raffenetti authored
      
      
      Changes the value of various static limits in the Portals4 netmod, based
      on experimentation results and suggestions from collaborators.
      
      1. Bump most ni_limits from 32K to 64K. These limits relate closely to
         queue depth. We can reasonably expect to support a queue depth
         of 64K.
      
      2. Limit issued origin events to 500. This translates to sending ~250
         operations to Portals at a time, which over IB is roughly the
         saturation point. TODO: turn this into a CVAR.
      
      3. Limit per target issued operations to 50. This will give the target a
         better chance to process events without being overwhelmed by a single
         process. TODO: turn this into a CVAR, also.
      
      4. Allocate more buffer space for incoming control messages. Observed
         results, especially with larger messages, showed that more buffer space
         cuts down on flow-control events.
      Signed-off-by: Antonio J. Pena <apenya@mcs.anl.gov>
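Items 2 and 3 of this commit describe two throttles: a global cap on issued origin events and a cap on operations issued per target. A minimal sketch of that kind of accounting follows. The limits 500 and 50 come from the commit message. The names, the fixed target count, and the assumption of two origin events per operation (inferred from "500 events, ~250 operations") are hypothetical and not taken from the Portals4 netmod.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_ORIGIN_EVENTS  500  /* from the commit message */
#define MAX_OPS_PER_TARGET  50  /* from the commit message */
#define EVENTS_PER_OP        2  /* assumption: ~250 ops for 500 events */
#define NUM_TARGETS          8  /* hypothetical number of target processes */

static int origin_events;             /* events currently outstanding */
static int target_ops[NUM_TARGETS];   /* operations outstanding per target */

/* Returns true and records the operation if both limits allow it. */
bool try_issue(int target)
{
    if (origin_events + EVENTS_PER_OP > MAX_ORIGIN_EVENTS)
        return false;   /* global cap reached */
    if (target_ops[target] >= MAX_OPS_PER_TARGET)
        return false;   /* this target already has enough in flight */
    origin_events += EVENTS_PER_OP;
    target_ops[target]++;
    return true;
}

/* Called when an operation to `target` has completed. */
void complete_op(int target)
{
    assert(target_ops[target] > 0);
    origin_events -= EVENTS_PER_OP;
    target_ops[target]--;
}
```

An operation refused by `try_issue` would be queued and retried after a later `complete_op`. Turning the two limits into CVARs, as the TODOs suggest, would replace the compile-time constants with runtime values.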
    • portals4: revert [722d85a4] and [d459c025] · 2f97f429
      Kenneth Raffenetti authored
      The two commits being reverted introduced a "safe" PtlMEAppend function
      that would call MPID_nem_ptl_poll to process some events when there
      was no space to append the match list entry. However, the poll function
      is not reentrant, which could lead to ordering problems.
      
      The increased list entry limit from [c6c0d6f6] should prevent
      PTL_NO_SPACE errors from happening, except in the extreme case. If we
      still find we are hitting this error, a proper fix can be done in the
      Rportals layer.
      Signed-off-by: Antonio J. Pena <apenya@mcs.anl.gov>
    • Charles J Archer's avatar
    • Update .gitignore. · 5addea2c
      Pavan Balaji authored and Kenneth Raffenetti committed
      
      Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>
    • Simplify the bcast test. · e01a20b6
      Pavan Balaji authored and Kenneth Raffenetti committed
      
      
      We are currently checking too many combinations, causing the test to
      take too long on some platforms.  This patch simplifies the test by
      building two versions of it.  In the first version, we run only on
      COMM_WORLD but go through all datatypes.  In the second version, we
      run on all communicators, but go through only a small subset of
      datatypes.
      Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>
    • Cosmetic changes to the bcast2 test. · be82b6a7
      Pavan Balaji authored and Kenneth Raffenetti committed
      
      
      1. Renamed bcast2 to bcast.
      
      2. White-space cleanup for bcast.c
      Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>
    • Get rid of bcast3.c · e7eab9df
      Pavan Balaji authored and Kenneth Raffenetti committed
      
      
      This test is exactly the same as bcast2.  Originally these two tests
      were different, but over time they have become essentially the same.
      There's no point testing the same thing twice.
      Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>
  10. 09 Apr, 2015 1 commit