1. 20 Apr, 2015 5 commits
    • Set size of IMMED data in RMA packets to 8 bytes. · de0412c2
      Xin Zhao authored
      
      
      Originally, the IMMED data in RMA packets was 16 bytes, which
      made the CH3 packet 56 bytes. Here we reduce the IMMED data in
      RMA packets to 8 bytes, so that the CH3 packet shrinks to
      48 bytes, the same as in mpich-3.1.4 (the old RMA
      infrastructure).
      Signed-off-by: Min Si <msi@il.is.s.u-tokyo.ac.jp>
      Signed-off-by: Antonio J. Pena <apenya@mcs.anl.gov>
      de0412c2
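The size arithmetic in this commit can be illustrated with a minimal C sketch. The 40-byte header is an assumption derived only from the figures in the message (56 - 16 = 48 - 8 = 40). The struct and field names are hypothetical and do not reflect the actual CH3 packet layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: everything in the packet that is not IMMED data.
 * Inferred from the commit message only (56 - 16 = 48 - 8 = 40). */
#define SKETCH_HEADER_BYTES 40

/* Layout with the original 16-byte IMMED area. */
typedef struct {
    unsigned char header[SKETCH_HEADER_BYTES];
    unsigned char immed[16];
} sketch_pkt_immed16_t;

/* Layout with the reduced 8-byte IMMED area. */
typedef struct {
    unsigned char header[SKETCH_HEADER_BYTES];
    unsigned char immed[8];
} sketch_pkt_immed8_t;

/* Total packet size for a given IMMED size under the assumed header. */
static size_t sketch_pkt_size(size_t immed_bytes)
{
    return SKETCH_HEADER_BYTES + immed_bytes;
}
```

Under this assumed layout, halving the IMMED area removes exactly 8 bytes from every packet, which matches the 56-to-48 reduction described above.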
    • Move 'stream_offset' out of RMA packet struct. · 19f29078
      Xin Zhao authored
      
      
      'stream_offset' specifies the starting position (on the target
      window) of the current streaming unit in ACC-like operations.
      It was originally stored in the RMA packet struct, which
      potentially increases the CH3 packet size.
      
      In this patch, we move 'stream_offset' out of the RMA
      packet as follows: 1. when the target data is a basic datatype,
      we use 'stream_offset' and the starting address of the entire
      operation to calculate the starting address of the current
      streaming unit, and overwrite 'addr' in the RMA packet with that
      value; 2. when the target data is a derived datatype, we cannot do
      the same thing, because the target needs to know both the
      starting address of the entire operation and the starting
      address of the current streaming unit. Therefore, we send
      'stream_offset' separately to the target side.
      Signed-off-by: Min Si <msi@il.is.s.u-tokyo.ac.jp>
      Signed-off-by: Antonio J. Pena <apenya@mcs.anl.gov>
      19f29078
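The two cases described in this commit can be sketched roughly as follows. This is an illustration only: the types and the function `sketch_prepare_unit` are hypothetical, and `stream_offset` is assumed to be a byte offset, which the commit message does not state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical packet: only the target address field is modeled. */
typedef struct {
    void *addr;
} sketch_rma_pkt_t;

/* Hypothetical message: the packet plus an optional separate offset. */
typedef struct {
    sketch_rma_pkt_t pkt;
    bool has_stream_offset;   /* true only for derived datatypes */
    size_t stream_offset;     /* assumed to be in bytes */
} sketch_rma_msg_t;

static sketch_rma_msg_t sketch_prepare_unit(void *op_base,
                                            size_t stream_offset,
                                            bool target_is_basic)
{
    sketch_rma_msg_t msg;
    if (target_is_basic) {
        /* Case 1: fold the offset into 'addr'; nothing extra is sent. */
        msg.pkt.addr = (char *) op_base + stream_offset;
        msg.has_stream_offset = false;
        msg.stream_offset = 0;
    } else {
        /* Case 2: the target needs both values, so 'addr' keeps the
         * start of the whole operation and the offset travels apart. */
        msg.pkt.addr = op_base;
        msg.has_stream_offset = true;
        msg.stream_offset = stream_offset;
    }
    return msg;
}
```

In the basic-datatype case the packet alone is enough for the target to locate the streaming unit, which is what allows the field to leave the packet struct.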
    • Moved a request assert to an earlier location · c09f3969
      Antonio J. Pena authored
      An assert protecting from a non-null request was happening too late in pkt_COOKIE_handler from
      mpid_nem_lmt.c. This patch moves it to an earlier location so that it's checked before it's first
      used.
      
      Reported by Dmitry Polyakov.
      c09f3969
    • OFI: Fix multiple providers failing probe-unexp · 4ef8d551
      Charles J Archer authored
      Update OFI netmod to match portals4 netmod anysource_matched semantics.
      4ef8d551
    • OFI: Update for removed FI_CANCEL flag · c89e8d8e
      Charles J Archer authored
      c89e8d8e
  2. 17 Apr, 2015 9 commits
  3. 16 Apr, 2015 2 commits
  4. 15 Apr, 2015 3 commits
    • Increase bcast_full time limit. · 04060d1d
      Pavan Balaji authored
      We increased the number of cases the bcast test was running in
      [e01a20b6].  This is causing it to time out on some platforms, where
      the test now seems to take close to 3 minutes.  This increased timeout
      should be sufficient on those platforms.
      
      No reviewer.
      04060d1d
    • OFI Netmod: Add CVAR enhancements for OFI provider selection · 1ed2b434
      Charles J Archer authored
       * Rename MPIR_CVAR_DUMP_PROVIDERS to MPIR_CVAR_OFI_DUMP_PROVIDERS
       * Add MPIR_CVAR_OFI_USE_PROVIDER, which takes the name of the
         desired provider as a string
      1ed2b434
    • PAMID: MPI_Allreduce/MPI_Reduce coredump w/ DOUBLE_INT datatype · e87c158f
      Sameh Sharkawi authored
      
      
      This commit includes multiple fixes:
       - Fixes for MPI_IN_PLACE checking. cudaGetPointerAttributes returns
         true on MPI_IN_PLACE, which causes issues. Now we check for
         MPI_IN_PLACE before passing the pointer to CUDA.
       - Enabling PAMID geometries (in order to get to PAMID collectives) when
         MP_CUDA_AWARE=yes. This allows for intercepting CUDA buffer.
       - Disabling FCA when MP_CUDA_AWARE=yes if user enables FCA.
       - Copying user recv buffer into temp recv host buffer before collective
         starts, especially in MPI_IN_PLACE cases.
      
      (ibm) D203255
      Signed-off-by: Tsai-Yang (Alan) Jea <tjea@us.ibm.com>
      e87c158f
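The first fix in this commit, checking for MPI_IN_PLACE before querying CUDA, might look roughly like the sketch below. To stay self-contained it uses neither MPI nor CUDA: `SKETCH_IN_PLACE` is a hypothetical stand-in for the MPI_IN_PLACE sentinel, and `stub_is_device_pointer` is a hypothetical stub for the CUDA pointer query named in the commit message, written to report "device" for every input so that it mimics the misbehavior described.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for MPI_IN_PLACE: a sentinel, not a real buffer. */
#define SKETCH_IN_PLACE ((const void *) -1)

/* Counts how often the (stubbed) CUDA query is reached. */
static int sketch_query_count = 0;

/* Hypothetical stub for the CUDA pointer query. It claims every pointer
 * is a device pointer, mimicking the problem reported for the sentinel. */
static bool stub_is_device_pointer(const void *p)
{
    (void) p;
    sketch_query_count++;
    return true;
}

/* Guarded check: the sentinel is filtered out before CUDA ever sees it. */
static bool sketch_sendbuf_on_device(const void *sendbuf)
{
    if (sendbuf == SKETCH_IN_PLACE)
        return false;
    return stub_is_device_pointer(sendbuf);
}
```

With the guard in place, the sentinel never reaches the pointer query, so its answer for that value no longer matters.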
  5. 14 Apr, 2015 2 commits
    • Fixed the Fortran common symbol issue on Mac. · eb0e7712
      Min Si authored
      
      
      The linker on Darwin does not allow common symbols, so libtool adds
      the -fno-common option by default for shared libraries. However,
      common symbols defined in different shared libraries and object files
      still cannot be treated as the same symbol.
      For example:
      with gfortran, the same common block gets separate memory locations in
      the shared libraries and in the object files;
      with ifort, the same common block in different shared libraries gets
      the same memory location, but still gets a different location in the
      object file.
      
      The -Wl,-commons,use_dylibs option asks the linker to check dylibs for
      definitions and use them to replace tentative definitions (commons) from
      object files. This resolves the common symbol mismatch
      between the object file and the dylibs (i.e., by setting the address of
      a common symbol to its location in the first dylib that is linked
      with the object file and contains this symbol). It needs to be added
      only in the linking stage for the final executable file.
      
      The -flat-namespace option allows the linker to unify the same common
      symbols in different dylibs. It needs to be added in the linking stage for
      both the shared library and the final executable file.
      (See man ld for their definitions.)
      
      Although gfortran works fine with only -flat-namespace, and ifort
      works with only -Wl,-commons,use_dylibs, we add both options
      here as a generic solution to be safe.
      Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>
      eb0e7712
  6. 11 Apr, 2015 1 commit
  7. 10 Apr, 2015 7 commits
    • portals4: tuning · daf29e33
      Kenneth Raffenetti authored
      
      
      Changes the value of various static limits in the Portals4 netmod, based
      on experimentation results and suggestions from collaborators.
      
      1. Bump most ni_limits from 32K to 64K. These limits relate closely to
         queue depth. We can reasonably expect to support a queue depth
         of 64K.
      
      2. Limit issued origin events to 500. This translates to sending ~250
         operations to Portals at a time, which over IB is roughly the
         saturation point. TODO: turn this into a CVAR.
      
      3. Limit per target issued operations to 50. This will give the target a
         better chance to process events without being overwhelmed by a single
         process. TODO: turn this into a CVAR, also.
      
      4. Allocate more buffer space for incoming control messages. Observed
         results, especially with larger messages, showed that more buffer space
         cuts down on flow-control events.
      Signed-off-by: Antonio J. Pena <apenya@mcs.anl.gov>
      daf29e33
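Limits 2 and 3 from this commit can be pictured as a simple admission check before each operation is issued. The sketch below is an assumption-laden illustration, not the netmod's code: all names are hypothetical, the fixed target count exists only to keep the example small, and the two-events-per-operation figure is inferred from the "500 events, ~250 operations" statement in the message.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_ORIGIN_EVENTS  500  /* limit 2 in the commit message */
#define SKETCH_EVENTS_PER_OP      2    /* assumption: 500 events ~ 250 ops */
#define SKETCH_MAX_OPS_PER_TARGET 50   /* limit 3 in the commit message */
#define SKETCH_NUM_TARGETS        8    /* hypothetical, for the example only */

typedef struct {
    int outstanding_events;                     /* origin events in flight */
    int ops_per_target[SKETCH_NUM_TARGETS];     /* ops in flight per target */
} sketch_issue_state_t;

/* Returns true and records the operation if both limits allow it;
 * returns false (caller should queue the operation) otherwise. */
static bool sketch_try_issue(sketch_issue_state_t *s, int target)
{
    if (s->outstanding_events + SKETCH_EVENTS_PER_OP > SKETCH_MAX_ORIGIN_EVENTS)
        return false;
    if (s->ops_per_target[target] >= SKETCH_MAX_OPS_PER_TARGET)
        return false;
    s->outstanding_events += SKETCH_EVENTS_PER_OP;
    s->ops_per_target[target]++;
    return true;
}
```

A real implementation would also decrement both counters as completion events arrive, which this sketch leaves out.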
    • portals4: revert [722d85a4] and [d459c025] · 2f97f429
      Kenneth Raffenetti authored
      The two commits being reverted introduced a "safe" PtlMEAppend function
      that would call MPID_nem_ptl_poll to process some events in case there
      was no space to append the match list entry. However, the poll function
      is not reentrant, which could lead to ordering problems.
      
      The increased list entry limit from [c6c0d6f6] should prevent PTL_NO_SPACE
      errors from happening, except in the extreme case. If we still find we are
      hitting this error, a proper fix can be done in the Rportals layer.
      Signed-off-by: Antonio J. Pena <apenya@mcs.anl.gov>
      2f97f429
    • Update .gitignore. · 5addea2c
      Pavan Balaji authored and Kenneth Raffenetti committed
      
      Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>
      5addea2c
    • Simplify the bcast test. · e01a20b6
      Pavan Balaji authored and Kenneth Raffenetti committed
      
      
      We are currently checking too many combinations, causing the
      test to take too long on some platforms.  This patch simplifies
      the test by building two versions of it.  In the first version,
      we run only on COMM_WORLD but go through all datatypes.
      In the second version, we run on all communicators, but go through
      only a small subset of datatypes.
      Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>
      e01a20b6
    • Cosmetic changes to the bcast2 test. · be82b6a7
      Pavan Balaji authored and Kenneth Raffenetti committed
      
      
      1. Renamed bcast2 to bcast.
      
      2. White-space cleanup for bcast.c
      Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>
      be82b6a7
    • Get rid of bcast3.c · e7eab9df
      Pavan Balaji authored and Kenneth Raffenetti committed
      
      
      This test is exactly the same as bcast2.  Originally these two tests
      were different, but over time they have become essentially the same.
      There's no point testing the same thing twice.
      Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>
      e7eab9df
  8. 09 Apr, 2015 1 commit
  9. 08 Apr, 2015 2 commits
  10. 07 Apr, 2015 8 commits