- 16 Dec, 2014 10 commits
-
-
Xin Zhao authored
No reviewer.
-
Xin Zhao authored
In this patch we allow GET/GACC response packets to piggyback some IMMED data, as we already do for PUT/GACC/FOP/CAS packets. No reviewer.
-
Xin Zhao authored
Originally, we only allowed a LOCK request to be piggybacked on small RMA operations (those whose data fits entirely in the packet header). This added communication overhead for larger operations, since the origin side had to wait for the LOCK ACK before it could transmit data to the target. This patch adds support for piggybacking LOCK on RMA operations of arbitrary size. Note that (1) this only works with basic datatypes; (2) if the LOCK cannot be satisfied, we temporarily buffer the operation on the target side. No reviewer.
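The target-side behaviour described in this commit can be pictured with a small model. This is only a sketch in Python, not the CH3 code: `Target`, `handle_op_with_lock`, and `release` are hypothetical names, and the lock is simplified to a single exclusive owner.

```python
from collections import deque


class Target:
    """Toy model of a target window with one exclusive lock (hypothetical)."""

    def __init__(self):
        self.lock_owner = None   # rank currently holding the lock, if any
        self.pending = deque()   # operations buffered until the lock is free
        self.memory = {}         # stand-in for the window buffer

    def handle_op_with_lock(self, origin, addr, value):
        """Handle a PUT-like operation that carries a piggybacked LOCK request."""
        if self.lock_owner is None or self.lock_owner == origin:
            self.lock_owner = origin
            self.memory[addr] = value
            return "applied"
        # The lock cannot be granted now: keep the whole operation for later.
        self.pending.append((origin, addr, value))
        return "buffered"

    def release(self, origin):
        """Unlock, then replay one buffered operation if any is waiting."""
        if self.lock_owner != origin:
            raise ValueError("only the lock owner may unlock")
        self.lock_owner = None
        if self.pending:
            next_origin, addr, value = self.pending.popleft()
            self.handle_op_with_lock(next_origin, addr, value)
```

In this model the origin never waits for a separate LOCK ACK before sending data; the cost of contention is paid on the target, which has to hold the buffered operation until the lock is released.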
-
Xin Zhao authored
No reviewer.
-
Xin Zhao authored
No reviewer.
-
Xin Zhao authored
No reviewer.
-
Xin Zhao authored
accumulated_ops_cnt tracks the number of posted RMA operations accumulated between two synchronization calls, so that we can decide when to poke the progress engine based on its current value. We initialize it to zero in the BEGINNING synchronization calls (Win_fence, Win_start, first Win_lock, Win_lock_all), and correctly decrement it in the ENDING synchronization calls (Win_fence, Win_complete, Win_unlock, Win_unlock_all, Win_flush, Win_flush_local, Win_flush_all, Win_flush_local_all). We also use a per-target counter to track the single-target case. No reviewer.
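A rough model of how such a counter might drive the progress engine is given below. Only the name `accumulated_ops_cnt` comes from the commit message. The `Window` class, the `POKE_THRESHOLD` value, and the rule "poke once the counter reaches the threshold" are assumptions made for illustration.

```python
POKE_THRESHOLD = 4  # hypothetical value; the commit message does not state one


class Window:
    """Toy window object that counts posted RMA operations (hypothetical)."""

    def __init__(self):
        self.accumulated_ops_cnt = 0
        self.pokes = 0  # how many times the progress engine was poked

    def begin_epoch(self):
        # Models a BEGINNING call such as Win_fence, Win_start or Win_lock_all.
        self.accumulated_ops_cnt = 0

    def post_op(self):
        # Models posting one RMA operation inside the epoch.
        self.accumulated_ops_cnt += 1
        if self.accumulated_ops_cnt >= POKE_THRESHOLD:
            self.pokes += 1  # stand-in for poking the progress engine

    def end_epoch(self, completed_ops):
        # Models an ENDING call such as Win_complete, Win_unlock or Win_flush.
        self.accumulated_ops_cnt -= completed_ops
```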
-
Xin Zhao authored
No reviewer.
-
Xin Zhao authored
Arrange RMA packet definition and structures in src/mpid/ch3/include/mpidpkt.h in the following order:
1. RMA operation packets: PUT, GET, ACC, GACC, CAS, FOP
2. RMA operation response packets: GET_RESP, GACC_RESP, CAS_RESP, FOP_RESP
3. RMA control packets: LOCK, UNLOCK, FLUSH, DECR_AT_COUNTER
4. RMA control response packets: LOCK_ACK, FLUSH_ACK
No reviewer.
-
Xin Zhao authored
Arrange RMA sync functions in src/mpid/ch3/src/ch3u_rma_sync.c in the following order: Win_fence, Win_post, Win_start, Win_complete, Win_wait, Win_test, Win_lock, Win_unlock, Win_flush, Win_flush_local, Win_lock_all, Win_unlock_all, Win_flush_all, Win_flush_local_all, Win_sync. No reviewer.
-
- 11 Dec, 2014 1 commit
-
-
Charles J Archer authored
-
- 09 Dec, 2014 2 commits
-
-
Charles J Archer authored
-
Charles J Archer authored
-
- 08 Dec, 2014 1 commit
-
-
This patch provides the following fix for the Windows conformance feature (which lets a single code base work on both Linux and Windows): - RMA mutexes fix for Windows Change-Id: Ib4f7b2ec8a07813f0ed35281a1d584637c84c0a9 Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
- 05 Dec, 2014 3 commits
-
-
Kenneth Raffenetti authored
No reviewer
-
- name publishing/lookup support (pmi v1 and v2) - job and node attrs (v2) Change-Id: Id18d968da0d0bbf6e8cb2e7acffaf77d82a5e8b0 Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
The original fix for 'romio gpfs: select correct read buffer' was still missing a critical piece for the last round to use the correct read buffer, resulting in a correctness issue that was missed by IOR but still found by the IBM PE test team. The fix was to correctly toggle the buffer after the last read. Signed-off-by:
Paul Coffman <pkcoff@us.ibm.com> Signed-off-by:
Rob Latham <robl@mcs.anl.gov>
-
- 04 Dec, 2014 1 commit
-
-
Min Si authored
This test passed a literal 0 as the size to win_create. The Fortran compiler treats it as a 32-bit integer, and because prototype checking is not supported, the Fortran binding passes it to the C mpi_win_create as an invalid 64-bit MPI_Aint. The test can fail if mpi_win_create internally initializes resources based on the value of size (e.g., mxm maps the window buffer in win_init). This patch fixes the issue by passing a 64-bit local variable as the size parameter instead of the constant 0 in this f90 test. Signed-off-by:
Junchao Zhang <jczhang@mcs.anl.gov>
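The width mismatch can be reproduced outside MPI. The sketch below uses Python's `struct` module and assumes a little-endian layout. The neighbouring bytes (`0x1234`) are arbitrary and stand for whatever happens to sit next to the 32-bit constant in memory.

```python
import struct

# The caller stores a 32-bit zero; the next 4 bytes hold unrelated data.
memory = struct.pack("<i", 0) + struct.pack("<i", 0x1234)

# The callee reads 8 bytes from the same address as a 64-bit integer.
size_seen_by_callee = struct.unpack("<q", memory)[0]

# The intended size was 0, but the callee sees the neighbouring bytes too.
print(hex(size_seen_by_callee))
```

Passing a variable that is already 64 bits wide, as the patch does, removes the dependence on adjacent memory.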
-
- 03 Dec, 2014 4 commits
-
-
Wesley Bland authored
No reviewer
-
James Dinan authored
Test for correct error class when a dynamic error code is created from a predefined error class. Signed-off-by:
Wesley Bland <wbland@anl.gov>
-
James Dinan authored
During error code creation, the error class was erroneously modified by applying ERROR_DYN_MASK. The dynamic bit is already set for user-defined error classes, so this bug had no effect in any existing MPICH test. However, when a predefined error class was passed during error code creation, it was incorrectly marked as dynamic, so MPI_Error_class returned an invalid error class for the resulting error code. Signed-off-by:
Wesley Bland <wbland@anl.gov>
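The effect of the stray mask can be shown with a simplified model. The bit position of `ERROR_DYN_MASK`, the value of `MPI_ERR_ARG`, the code layout, and the lookup table are illustrative assumptions, not the MPICH data structures.

```python
ERROR_DYN_MASK = 0x40000000  # illustrative bit marking a dynamic class or code
MPI_ERR_ARG = 12             # placeholder for any predefined error class

_class_of_code = {}          # hypothetical table: error code -> error class


def add_error_code(errclass, buggy=False):
    """Create a new dynamic error code that belongs to errclass."""
    code = ERROR_DYN_MASK | ((len(_class_of_code) + 1) << 8)
    # The bug: the dynamic bit was also forced onto the stored class.
    stored = (errclass | ERROR_DYN_MASK) if buggy else errclass
    _class_of_code[code] = stored
    return code


def error_class(code):
    """Model of looking up the class of a previously created code."""
    return _class_of_code[code]
```

With a user-defined class the dynamic bit is already present, so the buggy and fixed paths agree. Only a predefined class such as `MPI_ERR_ARG` comes back altered.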
-
Kenneth Raffenetti authored
Add the MPICH copyright header and remove C99 variable declaration in for loop. Signed-off-by:
Wesley Bland <wbland@anl.gov>
-
- 02 Dec, 2014 1 commit
-
-
The third argument of MPIR_Pack_size_impl should be a pointer to MPI_Aint. This patch fixes the incorrect use of an int pointer for MPIR_Pack_size_impl in the NEWMAD and MXM netmods. Signed-off-by:
Sangmin Seo <sseo@anl.gov>
-
- 28 Nov, 2014 3 commits
-
-
Pavan Balaji authored
Signed-off-by:
Wesley Bland <wbland@anl.gov>
-
Pavan Balaji authored
Also remove the function prototype declaration since it is not used out-of-order. Signed-off-by:
Wesley Bland <wbland@anl.gov>
-
Pavan Balaji authored
Signed-off-by:
Wesley Bland <wbland@anl.gov>
-
- 26 Nov, 2014 7 commits
-
-
Wesley Bland authored
This test was left out of the testlist for some reason. No reviewer
-
Wesley Bland authored
The function that converts the group of failed procs to a bitarray was incorrectly quitting early if one of the globally known failed processes was not in the communicator being dealt with. Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
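The intended behaviour can be sketched as follows. The function name and the list-based representation are hypothetical. The point is the `continue`, which skips a failed process that is outside the communicator instead of ending the conversion.

```python
def failed_group_to_bitarray(failed_world_ranks, comm_world_ranks):
    """Return one bit per communicator rank; 1 marks a failed process."""
    position = {world: i for i, world in enumerate(comm_world_ranks)}
    bits = [0] * len(comm_world_ranks)
    for world in failed_world_ranks:
        if world not in position:
            continue  # not in this communicator: skip it, do not stop
        bits[position[world]] = 1
    return bits
```

For example, with failed world ranks `[5, 2]` and a communicator containing world ranks `[0, 2, 3]`, rank 5 is skipped and rank 2 is still marked. A loop that returned at the first miss would report no failures at all.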
-
Wesley Bland authored
Since pamid doesn't include any of the fault tolerance functions, it should never report that a message is pending failure. We also can't call abort here since the function is used at the MPI layer. Signed-off-by:
Paul Coffman <pkcoff@us.ibm.com>
-
Kenneth Raffenetti authored
MPICH now behaves correctly for this test. There is no reason for it to output " No errors", since the only thing we are testing for is that it does not time out. We also use a non-zero error code in MPI_Abort to fit the requirements of the test runner. Closes #1537 Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Kenneth Raffenetti authored
If a fatal error occurs, pass the MPI error code to MPID_Abort. To ensure non-zero exit status with dynamic error codes, we set the first available dynamic error class to 1. Refs #1537 Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Kenneth Raffenetti authored
Implement abort in the Hydra PMI server and modify simple PMI to send an abort command. Previously, we just exited the calling process and relied on the process manager to detect it and clean up the rest of the job. Refs #1537 Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Kenneth Raffenetti authored
We simply use PMI_Abort in both the sock and nemesis code. Remove extra functions and constants that are not useful. Refs #1537 Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
- 24 Nov, 2014 3 commits
-
-
ROMIO GPFSMPIO_P2PCONTIG threaded read needs to toggle first read buffer. When using both the GPFSMPIO_P2PCONTIG and GPFSMPIO_PTHREADIO optimizations, there was a correctness bug on reads: in the first round the read buffer did not toggle to the two-phase buffer for the pthread reader, so the data was disseminated from the wrong buffer. The fix is to do the toggle after the first read. Signed-off-by:
Paul Coffman <pkcoff@us.ibm.com> Signed-off-by:
Rob Latham <robl@mcs.anl.gov>
-
Xin Zhao authored
It is possible for the request handler of an RMA request to be called a second time, on the same request, from inside the first invocation. Consider the following case: a request is queued in the Nemesis SHM queue with a ref count of 2, one reference for request completion and another for dequeueing from the SHM queue. The first invocation of the handler completes the request and decrements the ref count to 1. The request is still in the queue. However, within this handler we trigger the same handler on the same request again (for example, by making progress on the SHM queue), and the second invocation also tries to complete the request, which leads to incorrect execution. In this patch we check whether the request has already been completed on entry to the request handler, to prevent processing the same request twice. We also move the call to finish_op_on_target() (where the same handler can be triggered again) after the request completion routine, so that the current request is marked as completed before the handler is entered a second time. Fix #2204 Signed-off-by:
Pavan Balaji <balaji@anl.gov>
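The guard and the reordering described in this commit can be sketched in a few lines. This is a model rather than the CH3 implementation: `Request`, `req_handler`, and the body of `finish_op_on_target` are stand-ins, and the re-entry is forced unconditionally to make the effect visible.

```python
class Request:
    """Toy RMA request (hypothetical fields)."""

    def __init__(self):
        self.ref_count = 2        # one for completion, one for the SHM queue
        self.completed = False
        self.times_processed = 0


def finish_op_on_target(req):
    # Stand-in: making progress here may invoke the same handler again.
    req_handler(req)


def req_handler(req):
    if req.completed:
        return                    # already handled: do nothing on re-entry
    req.times_processed += 1
    req.completed = True          # mark completion first ...
    req.ref_count -= 1
    finish_op_on_target(req)      # ... then call code that may re-enter
```

If `finish_op_on_target` were called before the request was marked as completed, the nested call would pass the guard and process the request a second time.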
-
Update the use of DOCTEXT to match the rest of MPICH, including adding -nolocation (drop the location of the source file from the documentation) and ensure that the mpi.cit file contains the I/O routines as well as the others (this file can be used to add links to the man pages in other documents). Signed-off-by:
Rob Latham <robl@mcs.anl.gov>
-
- 23 Nov, 2014 2 commits
-
-
Min Si authored
Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Min Si authored
Three datatype test levels are defined: basic, min, and full (the default). The default level can be overridden at runtime by setting the environment variable MPITEST_DATATYPE_TEST_LEVEL. An MPI test can also specify a different level for each datatype loop by calling the corresponding datatype test initialization function before that loop; otherwise the default level is used. Basic: MTestInitBasicDatatypes; Minimum: MTestInitMinDatatypes; Full: MTestInitFullDatatypes. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
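The level selection could be modelled as below. The environment variable name comes from the commit message. The accepted strings (`basic`, `min`, `full`) and the fallback to `full` for unrecognized values are assumptions, and `datatype_test_level` is a hypothetical helper, not part of the MPICH test suite.

```python
import os

LEVELS = ("basic", "min", "full")  # assumed spellings of the three levels


def datatype_test_level(environ=None):
    """Return the datatype test level, defaulting to "full"."""
    environ = os.environ if environ is None else environ
    level = environ.get("MPITEST_DATATYPE_TEST_LEVEL", "full").strip().lower()
    # Assumption: an unrecognized value falls back to the default level.
    return level if level in LEVELS else "full"
```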
-
- 21 Nov, 2014 2 commits
-
-
* Implements a tag matching interface netmod over the OFIWG Scalable Fabric Interfaces (SFI)
-
At the end of MPIDI_Init_collsel_extension in the pami device init code (mpid_init.c), there is logic that disables the optimized collectives based on criteria that are invalid on BGQ but nonetheless always evaluated to true, so the optimized collectives were always disabled on BGQ. Compiler directives were placed around the logic so that it is skipped on the BGQ platform. Signed-off-by:
Paul Coffman <pkcoff@us.ibm.com> Signed-off-by:
Rob Latham <robl@mcs.anl.gov>
-