- 10 Apr, 2015 3 commits
-
-
We are currently checking too many combinations, causing the test to take too long on some platforms. This patch simplifies the test by building two versions of it. In the first version, we run only on COMM_WORLD but go through all datatypes. In the second version, we run on all communicators but go through only a small subset of datatypes. Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
1. Renamed bcast2 to bcast. 2. White-space cleanup for bcast.c. Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
This test is exactly the same as bcast2. Originally these two tests were different, but over time they have become essentially the same. There's no point testing the same thing twice. Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
- 09 Apr, 2015 1 commit
-
-
Antonio Pena Monferrer authored
The datatype size was checked outside the appropriate branches in a couple of places. Signed-off-by:
Xin Zhao <xinzhao3@illinois.edu>
-
- 08 Apr, 2015 2 commits
-
-
Antonio J. Pena authored
This reverts commit b47d95f7.
-
Kenneth Raffenetti authored
The previous design for MPICH control messages utilized a small set of "use once" buffers that could be quickly exhausted. The new approach processes all control messages via an unexpected queue. The benefit is a larger incoming message capacity, leading to fewer flow-control events. Signed-off-by:
Antonio J. Pena <apenya@mcs.anl.gov>
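The capacity argument in the commit above can be illustrated with a minimal sketch. This is not the netmod code: `ctrl_queue_t`, `ctrl_enqueue`, `ctrl_dequeue`, and `CTRL_MSG_MAX` are hypothetical names, and the sketch only assumes that a queue which grows on demand replaces a fixed pool of "use once" buffers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define CTRL_MSG_MAX 64  /* hypothetical upper bound on a control message */

/* One queued control message (hypothetical layout). */
typedef struct ctrl_node {
    struct ctrl_node *next;
    size_t len;
    char payload[CTRL_MSG_MAX];
} ctrl_node_t;

/* FIFO that grows as messages arrive, instead of a fixed buffer pool. */
typedef struct {
    ctrl_node_t *head;
    ctrl_node_t *tail;
} ctrl_queue_t;

/* Copy an incoming control message to the tail; false only if out of memory. */
bool ctrl_enqueue(ctrl_queue_t *q, const void *msg, size_t len)
{
    assert(len <= CTRL_MSG_MAX);
    ctrl_node_t *n = malloc(sizeof *n);
    if (n == NULL)
        return false;
    n->next = NULL;
    n->len = len;
    memcpy(n->payload, msg, len);
    if (q->tail != NULL)
        q->tail->next = n;
    else
        q->head = n;
    q->tail = n;
    return true;
}

/* Pop the oldest message into out; false if the queue is empty. */
bool ctrl_dequeue(ctrl_queue_t *q, void *out, size_t *len)
{
    ctrl_node_t *n = q->head;
    if (n == NULL)
        return false;
    memcpy(out, n->payload, n->len);
    *len = n->len;
    q->head = n->next;
    if (q->head == NULL)
        q->tail = NULL;
    free(n);
    return true;
}
```

With this shape, a burst of control messages is limited by available memory rather than by the number of pre-posted buffers, which is the property the commit message credits for the reduction in flow-control events.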
-
- 07 Apr, 2015 14 commits
-
-
Norio Yamaguchi authored
Also change individual author to organization names. Signed-off-by:
Huiwei Lu <huiweilu@mcs.anl.gov>
-
Norio Yamaguchi authored
Signed-off-by:
Huiwei Lu <huiweilu@mcs.anl.gov>
-
Norio Yamaguchi authored
Signed-off-by:
Huiwei Lu <huiweilu@mcs.anl.gov>
-
Norio Yamaguchi authored
Signed-off-by:
Huiwei Lu <huiweilu@mcs.anl.gov>
-
Norio Yamaguchi authored
Signed-off-by:
Huiwei Lu <huiweilu@mcs.anl.gov>
-
Norio Yamaguchi authored
Signed-off-by:
Huiwei Lu <huiweilu@mcs.anl.gov>
-
Norio Yamaguchi authored
Signed-off-by:
Huiwei Lu <huiweilu@mcs.anl.gov>
-
Norio Yamaguchi authored
Signed-off-by:
Huiwei Lu <huiweilu@mcs.anl.gov>
-
Norio Yamaguchi authored
Signed-off-by:
Huiwei Lu <huiweilu@mcs.anl.gov>
-
Norio Yamaguchi authored
Signed-off-by:
Huiwei Lu <huiweilu@mcs.anl.gov>
-
Norio Yamaguchi authored
Signed-off-by:
Huiwei Lu <huiweilu@mcs.anl.gov>
-
Kenneth Raffenetti authored
Use the VC private area to track outstanding send operations. This way, when a VC close packet comes in, we wait until all remaining operations are complete before closing locally. This allows for a simpler netmod finalize function where we are sure the network is safe to shut down. Signed-off-by:
Antonio J. Pena <apenya@mcs.anl.gov>
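The deferred-close bookkeeping described in the commit above can be sketched as follows. The struct and function names (`vc_private_t`, `vc_send_issued`, `vc_send_completed`, `vc_close_packet_received`) are hypothetical and do not come from the MPICH source. The sketch only assumes a per-VC counter of in-flight sends and a flag recording that a close packet has arrived.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-VC private area. */
typedef struct {
    int  outstanding_sends;  /* sends issued but not yet completed */
    bool close_requested;    /* a close packet has arrived */
    bool closed;             /* the local close has been performed */
} vc_private_t;

/* Close locally only once a close was requested and nothing is in flight. */
static void vc_try_close(vc_private_t *vc)
{
    if (vc->close_requested && vc->outstanding_sends == 0)
        vc->closed = true;
}

/* Call when a send is handed to the network. */
void vc_send_issued(vc_private_t *vc)
{
    assert(!vc->closed);
    vc->outstanding_sends++;
}

/* Call from the send-completion path. */
void vc_send_completed(vc_private_t *vc)
{
    assert(vc->outstanding_sends > 0);
    vc->outstanding_sends--;
    vc_try_close(vc);
}

/* Call when a VC close packet is received. */
void vc_close_packet_received(vc_private_t *vc)
{
    vc->close_requested = true;
    vc_try_close(vc);
}
```

Because every VC reaches the closed state only after its counter drops to zero, a finalize routine built on this scheme would not need to drain sends itself.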
-
Antonio J. Pena authored
The tests were modifying local buffers without locking them after window creation, causing potential race conditions. I've moved the buffer initialization to be performed before the global window is created. These tests were failing with incorrect results in Jenkins with async enabled. Signed-off-by:
Xin Zhao <xinzhao3@illinois.edu>
-
Antonio J. Pena authored
The datatypes shouldn't be released until we make sure that there are no more remote operations using them. I've changed several tests to release the datatype after a barrier. To avoid introducing a barrier in every iteration, and to add a little more stress, I've restructured the tests so that the datatypes are not created and freed every iteration. This was causing intermittent segfaults, mainly with async enabled. Signed-off-by:
Xin Zhao <xinzhao3@illinois.edu>
-
- 06 Apr, 2015 1 commit
-
-
Sameh Sharkawi authored
(ibm) D203212 Signed-off-by:
Coffman <pkcoff@bldlnx65.pok.stglabs.ibm.com> Signed-off-by:
Sameh Sharkawi <sssharka@us.ibm.com>
-
- 03 Apr, 2015 10 commits
-
-
Rob Latham authored
Instead of creating the window at open time (depending on hints), let's defer the window creation until we need it. Signed-off-by:
Paul Coffman <pkcoff@us.ibm.com>
-
Optimization to use the PAMI_Rput_typed / PAMI_Rget_typed call in the case where PAMID MPI_Put / MPI_Get is called with a derived (non-contiguous) datatype. Instead of breaking the MPI datatype up into contiguous chunks on the MPICH side and repeatedly calling PAMI_Rput / PAMI_Rget for each chunk with the associated overhead, create a PAMI datatype to represent the MPI derived type and make just 1 call to PAMI_Rput_typed / PAMI_Rget_typed. We deal with non-contiguous buffers by avoiding packing and using origin buffers (as in PAMI). Guarded by the PAMID_TYPED_ONESIDED environment variable. Signed-off-by:
Rob Latham <robl@mcs.anl.gov>
-
Added support to additionally run two-phase aggregation, which has the read-modify-write capability, in cases where the one-sided write aggregation encounters holes in the data. Added two new environment variables (GPFSMPIO_ONESIDED_NO_RMW, GPFSMPIO_ONESIDED_INFORM_RMW) to control this behavior and inform the user. Signed-off-by:
Rob Latham <robl@mcs.anl.gov>
-
read-modify-write for holes at the beginning Added support to correctly handle a data pattern that has a hole only at the beginning of the file offset range to essentially ignore the hole and begin writing at the first offset with actual data, thereby avoiding the need for a read-modify-write. Signed-off-by:
Rob Latham <robl@mcs.anl.gov>
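The leading-hole case described in the commit above can be sketched as a small check. This is an illustration only, not the ROMIO implementation: `chunk_t` and `needs_rmw` are hypothetical names, and the sketch assumes the chunks are sorted by offset, do not overlap, and that `range_end` is exclusive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one contiguous piece of data in the file. */
typedef struct {
    long long offset;  /* file offset of the first byte */
    long long length;  /* number of bytes */
} chunk_t;

/*
 * Decide whether a read-modify-write is needed for the offset range ending
 * at range_end (exclusive). A hole before the first chunk is ignored:
 * writing simply begins at the first offset that has data, returned in
 * *write_start. A hole between chunks or after the last chunk still
 * requires read-modify-write.
 */
bool needs_rmw(const chunk_t *chunks, size_t n, long long range_end,
               long long *write_start)
{
    assert(n > 0);
    *write_start = chunks[0].offset;   /* skip any leading hole */
    long long expected = chunks[0].offset;
    for (size_t i = 0; i < n; i++) {
        if (chunks[i].offset > expected)
            return true;               /* interior hole */
        expected = chunks[i].offset + chunks[i].length;
    }
    return expected < range_end;       /* trailing hole */
}
```

For a range of [0, 200) with data only in [100, 200), this sketch reports that no read-modify-write is needed and that the write can start at offset 100.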
-
source buffer bug fixes The CESM climate model decomps for fill-value support exposed several bugs in the algorithm related to non-contiguous source buffers, which have been fixed. Those issues include: Mishandling of ranks with no data. Miscalculations of the source buffer offsets utilizing the flattened buffer mechanisms. Mishandling of negative source buffer offsets. Inefficient and inaccurate memory management of temporary buffers used to collect non-contiguous chunks for a given file offset. Signed-off-by:
Rob Latham <robl@mcs.anl.gov>
-
Code to enable the usage of the optimized one-sided collective IO aggregation algorithm from the ADIOI_GPFS_WriteStridedColl and ADIOI_GPFS_ReadStridedColl functions. Signed-off-by:
Rob Latham <robl@mcs.anl.gov>
-
Optimized collective IO algorithm for GPFS to replace the existing two-phase algorithm with one utilizing one-sided MPI_Put and MPI_Get. Significant performance and memory optimization possible for certain workloads. Guarded by GPFSMPIO_AGGMETHOD environment variable, see ad_gpfs_tuning.c for details. Signed-off-by:
Rob Latham <robl@mcs.anl.gov>
-
Optimized collective IO aggregation algorithm added for GPFS which replaces the existing two-phase aggregation with one-sided MPI_Put and MPI_Get for writing and reading respectively. Significant performance and memory optimization possible for many workloads. Guarded by the GPFSMPIO_AGGMETHOD environment variable -- see ad_gpfs_tuning.c for details. Signed-off-by:
Rob Latham <robl@mcs.anl.gov>
-
Kenneth Raffenetti authored
Increases the number of matching list entries we can append. The netmod would bump into this limit on the Get portal when doing many sends above the large threshold. No reviewer.
-
Kenneth Raffenetti authored
When cleaning up uses of the max_eqs limit, I missed where we were passing it to the Rportals layer. Use EVENT_COUNT instead, same as the PtlEQAlloc calls. No reviewer.
-
- 02 Apr, 2015 2 commits
-
-
Rob Latham authored
This reverts commit 7a366031. At one point (circa MPICH2-1.5) this clever optimization behaved as expected. Once ported forward to more recent MPICH, it segfaults on trivial MPI applications. Closes: #2217 Reopens: #1835 Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
Sameh Sharkawi authored
This allows systems that have no GPUs installed and no libcuda to use libmpi.so without getting "unable to load library" errors. (ibm) D203056 Signed-off-by:
Tsai-Yang (Alan) Jea <tjea@us.ibm.com> Signed-off-by:
Sameh Sharkawi <sssharka@us.ibm.com>
-
- 01 Apr, 2015 1 commit
-
-
Su Huang authored
(ibm) D203137 Signed-off-by:
Sameh Sharkawi <sssharka@us.ibm.com>
-
- 31 Mar, 2015 4 commits
-
-
Charles J Archer authored
-
Kenneth Raffenetti authored
Signed-off-by:
Antonio J. Pena <apenya@mcs.anl.gov>
-
Kenneth Raffenetti authored
Signed-off-by:
Antonio J. Pena <apenya@mcs.anl.gov>
-
Kenneth Raffenetti authored
The max_eqs limit returned at PtlNIInit time is actually a limit on the number of event queues a particular interface can support. It is not the number of events a single queue can hold, which is how we were using it. Since there is no way to query for the max events a queue can hold, we simply request a conservative 32K. Thanks to Sayantan Sur for pointing out our error. Signed-off-by:
Antonio J. Pena <apenya@mcs.anl.gov>
-
- 30 Mar, 2015 1 commit
-
-
Charles J Archer authored
-
- 28 Mar, 2015 1 commit
-
-
Pavan Balaji authored
Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-