- 16 Mar, 2015 6 commits
-
-
Sameh Sharkawi authored
MPICH collectives use MPIR_Localcopy to move data from the source buffer to the destination buffer within the same task. MPIR_Localcopy can't handle GPU buffers, and changing MPIR_Localcopy would affect all common code. This change instead checks for GPU buffers in the PAMID collectives layer, allocates a host buffer, and copies data from the GPU buffer to the host buffer and vice versa, so that MPIR_Localcopy works without issues. This code is not performance optimized. (ibm) D202834 Signed-off-by:
Su Huang <suhuang@us.ibm.com>
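The staging approach described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the PAMID code: `demo_buf`, its `on_device` flag, and all `demo_*` functions are hypothetical, and the device transfers are simulated with `memcpy` so the sketch runs without a GPU.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer description; real code would query the GPU runtime. */
struct demo_buf {
    void *ptr;
    int on_device;              /* nonzero if ptr refers to GPU memory */
};

/* Stand-ins for device<->host transfers (plain memcpy in this sketch). */
static void demo_dev_to_host(void *host, const void *dev, size_t n)
{
    memcpy(host, dev, n);
}

static void demo_host_to_dev(void *dev, const void *host, size_t n)
{
    memcpy(dev, host, n);
}

/* Stand-in for a local copy routine that only understands host memory. */
static void demo_localcopy(const void *src, void *dst, size_t n)
{
    memcpy(dst, src, n);
}

/* Copies n bytes from src to dst, staging through host buffers when needed.
 * Returns 0 on success, 1 if a staging buffer could not be allocated. */
int demo_copy(struct demo_buf src, struct demo_buf dst, size_t n)
{
    void *hsrc = src.ptr;
    void *hdst = dst.ptr;
    int rc = 1;

    if (n == 0)
        return 0;

    if (src.on_device) {
        hsrc = malloc(n);
        if (hsrc == NULL)
            goto cleanup;
        demo_dev_to_host(hsrc, src.ptr, n);
    }
    if (dst.on_device) {
        hdst = malloc(n);
        if (hdst == NULL)
            goto cleanup;
    }

    demo_localcopy(hsrc, hdst, n);      /* both pointers are host memory here */

    if (dst.on_device)
        demo_host_to_dev(dst.ptr, hdst, n);
    rc = 0;

  cleanup:
    if (src.on_device)
        free(hsrc);                     /* free(NULL) is a no-op */
    if (dst.on_device && hdst != dst.ptr)
        free(hdst);
    return rc;
}
```

The extra allocation and the two additional copies per call are consistent with the note that this path is not performance optimized.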
-
Kenneth Raffenetti authored
Track this library separately so as not to impose its requirements when building other components like ROMIO or Hydra. No reviewer.
-
Kenneth Raffenetti authored
Commit 7dfe2840 broke builds with error checking disabled. We fix this by moving the guards to ensure the fn_exit and fn_fail labels are always present and only the error checking code is preprocessed out. Fixes #2225 Signed-off-by:
Rajeev Thakur <thakur@mcs.anl.gov>
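A minimal sketch of the guard placement this fix describes, assuming a hypothetical macro `DEMO_ERROR_CHECKING` and a hypothetical function `demo_double_all` (neither is taken from the MPICH source). Only the validation sits inside the preprocessor guard, so the `fn_exit` and `fn_fail` labels exist in every configuration.

```c
#include <stddef.h>

/* Hypothetical switch standing in for the build's error-checking option. */
#ifndef DEMO_ERROR_CHECKING
#define DEMO_ERROR_CHECKING 1
#endif

/* Doubles each element in place; returns 0 on success, 1 on invalid input. */
int demo_double_all(int *buf, int count)
{
    int err = 0;
    int i;

#if DEMO_ERROR_CHECKING
    /* Only this validation is compiled out when checking is disabled. */
    if (buf == NULL || count < 0) {
        err = 1;
        goto fn_fail;
    }
#endif

    for (i = 0; i < count; i++)
        buf[i] *= 2;

  fn_exit:
    return err;
  fn_fail:
    /* Cleanup would go here; the label is present in every build. */
    goto fn_exit;
}
```

If the labels were placed inside the guard instead, any `goto fn_fail` outside the guarded region would fail to compile when error checking is disabled.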
-
Wesley Bland authored
This code was using uint32_t as a fixed-size variable to simplify the code, but we don't currently require C99. The patch removes that usage and does everything with ints and more math. Fixes #2245 Signed-off-by:
Antonio J. Pena <apenya@mcs.anl.gov>
-
Wesley Bland authored
This is another function that is no longer used as it was replaced by a static function instead. This also removes an internal test that is no longer useful for this code. Signed-off-by:
Antonio J. Pena <apenya@mcs.anl.gov>
-
Wesley Bland authored
Originally this function was implemented in the CH3 layer, but it was moved up to the MPI layer as an MPIR function. The old MPID version doesn't need to exist anymore. Signed-off-by:
Antonio J. Pena <apenya@mcs.anl.gov>
-
- 15 Mar, 2015 1 commit
-
-
Huiwei Lu authored
The Myrinet MX network module, which had a life cycle from 1.1 to 3.1.2, has now been deprecated. Signed-off-by:
Antonio J. Pena <apenya@mcs.anl.gov>
-
- 13 Mar, 2015 1 commit
-
-
Pavan Balaji authored
No reviewer.
-
- 10 Mar, 2015 1 commit
-
-
Wesley Bland authored
-
- 09 Mar, 2015 3 commits
-
-
Xin Zhao authored
The file ch3u_rma_oplist.c contains functions that make progress on RMA on the origin side. Here we change the name of the file to a more suitable one. No reviewer.
-
Kenneth Raffenetti authored
Removes additional whitespace in testlist files to make them easier to manipulate with tools like sed. No reviewer.
-
Kenneth Raffenetti authored
Put tests that check if MPI correctly detects aliased buffers in collective operations into the errors section of the testsuite. Fixes #2211 Signed-off-by:
Wesley Bland <wbland@anl.gov>
-
- 06 Mar, 2015 3 commits
-
-
Norio yamaguchi authored
-
Norio yamaguchi authored
-
Wesley Bland authored
No reviewer
-
- 05 Mar, 2015 4 commits
-
-
Although MPIDI_CH3I_progress_blocked is a variable only used in CH3, it was referenced in the ROMIO glue code. This caused a build problem when pamid is used as a device. This patch removed the reference to MPIDI_CH3I_progress_blocked, but it degrades the efficiency of MPIR_Ext_cs_yield_allfunc_if_progress_blocked() since we do not have a way to check if the progress engine is blocked for now (related to ticket #2202). For a better solution for ticket #2202, we need to fix a wait function of the extended generalized request. Fixes #2242 Signed-off-by:
Rob Latham <robl@mcs.anl.gov>
-
Huiwei Lu authored
comm_idup was caught failing on mpich-portals4 with configuration "intel,strict,ib". It was not fully tested on portals4 because portals was added after comm_dup patch. On other platforms comm_idup seems to be OK. Ticket #2243 No reviewer
-
Junchao Zhang authored
Since MPI-3.1 has not been voted on this time. Signed-off-by:
Antonio J. Pena <apenya@mcs.anl.gov>
-
Antonio J. Pena authored
-
- 04 Mar, 2015 21 commits
-
-
Antonio J. Pena authored
Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Antonio J. Pena authored
This reverts commit 077346f6. Signed-off-by:
Wesley Bland <wbland@anl.gov>
-
Antonio J. Pena authored
This reverts commit c2ea6afc. Signed-off-by:
Wesley Bland <wbland@anl.gov>
-
Antonio J. Pena authored
This reverts commit 38df8d2a. Signed-off-by:
Wesley Bland <wbland@anl.gov>
-
Antonio J. Pena authored
-
Antonio J. Pena authored
-
Antonio J. Pena authored
-
Wesley Bland authored
No reviewer
-
Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
In the MPI standard, a predefined datatype is called a basic type. It is better for the name used in the code to match the standard. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
The implementations of sendNoncontig for intra-node communication in Nemesis and inter-node communication in network modules (except for TCP and SCIF) assume that req->dev.segment_first is zero and req->dev.segment_size is the size of the data, which is not always true. If we stream an RMA operation and issue only part of the derived data, req->dev.segment_first specifies the current starting location of the data and req->dev.segment_size specifies the current ending location of the data. The data size should therefore be (req->dev.segment_size - req->dev.segment_first). This patch corrects this issue in Nemesis and the network modules. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
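The size computation described in this commit message can be illustrated with a small sketch. The struct `demo_req` and the function `demo_bytes_to_send` are hypothetical stand-ins that mirror only the two field names mentioned above; they are not the MPICH request type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two request fields named above. */
struct demo_req {
    size_t segment_first;   /* current starting location of the data */
    size_t segment_size;    /* current ending location of the data */
};

/* Number of bytes covered by the current segment of the request. */
size_t demo_bytes_to_send(const struct demo_req *req)
{
    assert(req->segment_first <= req->segment_size);
    return req->segment_size - req->segment_first;
}
```

With segment_first = 256 and segment_size = 1024, the segment covers 768 bytes; code that assumed segment_first is zero would instead treat it as 1024 bytes.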
-
The original implementation of ACC/GACC on SHM first allocates a temporary buffer that has the same data layout as the target data, copies the entire origin data to that temporary buffer, and then performs the ACC computation between the temporary buffer and the target buffer. The temporary buffer can potentially use a large amount of memory. This patch fixes this issue as follows: (1) SHM ACC/GACC routines directly call the do_accumulate_op() function, which requires the origin data to be in a 'packed manner'; (2) if the origin data is of a basic type, we directly perform do_accumulate_op() between the origin buffer and the target buffer; if the origin data is derived, we stream the origin data by copying part of the origin data into a packed streaming buffer and performing do_accumulate_op() between the streaming buffer and the target buffer each time. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
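The streaming scheme for derived origin data might look roughly like the sketch below. It assumes a strided array of ints as the "derived" layout, a sum as the accumulate operation, and a tiny stream unit of four elements; `DEMO_STREAM_ELEMS`, `demo_accumulate_sum`, and `demo_stream_acc` are hypothetical names, with `demo_accumulate_sum` standing in for do_accumulate_op().

```c
#include <stddef.h>

#define DEMO_STREAM_ELEMS 4     /* hypothetical stream unit, in elements */

/* Stand-in for do_accumulate_op(): adds n packed origin elements to the
 * target, starting at target element stream_offset. */
static void demo_accumulate_sum(const int *packed, size_t n,
                                int *target, size_t stream_offset)
{
    size_t i;
    for (i = 0; i < n; i++)
        target[stream_offset + i] += packed[i];
}

/* Accumulates count strided origin elements (element k is stored at
 * origin[k * stride]) into a contiguous target, one stream unit at a time. */
void demo_stream_acc(const int *origin, size_t stride, size_t count,
                     int *target)
{
    int unit[DEMO_STREAM_ELEMS];        /* packed streaming buffer */
    size_t done = 0;

    while (done < count) {
        size_t n = count - done;
        size_t i;

        if (n > DEMO_STREAM_ELEMS)
            n = DEMO_STREAM_ELEMS;
        for (i = 0; i < n; i++)         /* pack part of the origin data */
            unit[i] = origin[(done + i) * stride];
        demo_accumulate_sum(unit, n, target, done);
        done += n;
    }
}
```

The temporary storage is bounded by the stream unit rather than by the size of the whole operation, which is the memory saving the commit message describes.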
-
For queued ACC/GACC data piggybacked with LOCK, we do not need to allocate the buffer for the entire operation, but only need to allocate a buffer with stream unit size. This patch fixes this issue. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
On the target side, we always allocate an SRBuf of 256K, which equals the size of a stream unit, to receive ACC/GACC data. Note that in MPIDI_CH3U_Request_load_recv_iov(), for ACC/GACC operations, since we already use an SRBuf to receive the data at the beginning, we do not use another SRBuf here, in order to avoid one more memory copy. Also, we pass the stream_offset in the current RMA packet to the request struct (when receiving is not finished) and to the do_accumulate_op function (when receiving is finished). Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Originally, do_accumulate_op() was used to perform the ACC computation on the target between data from the origin side and data in the target window. It required that the target side first unpack the received origin data into the same data layout as the target data before calling this function, which may consume a potentially large amount of memory. This patch changes the do_accumulate_op() function in the following aspects: (1) It requires that the origin data passed to the function be "in a packed manner", which means it looks as if all basic type elements in the origin data are placed one after another. Note that the origin data is not necessarily contiguous, since we may use a non-contiguous basic type. If the basic type is contiguous, then the origin data must be contiguous. (2) It adds a new function argument, stream_offset, which specifies a starting location in the target data. This allows the origin data to work with part of the target data, of stream size. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
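As a rough illustration of the two changes, the sketch below applies one packed stream unit to a target with a non-contiguous layout. It assumes int elements, a sum operation, a simple strided target, and a stream_offset counted in elements; `demo_acc_unit` and its parameters are hypothetical and do not reproduce the real do_accumulate_op() signature.

```c
#include <assert.h>
#include <stddef.h>

/* Adds n packed origin elements to a strided target.
 *   packed        origin elements placed one after another
 *   target        logical element k is stored at target[k * target_stride]
 *   target_count  number of logical elements in the target data
 *   stream_offset first logical target element this unit applies to */
void demo_acc_unit(const int *packed, size_t n,
                   int *target, size_t target_stride, size_t target_count,
                   size_t stream_offset)
{
    size_t i;

    assert(stream_offset + n <= target_count);
    for (i = 0; i < n; i++)
        target[(stream_offset + i) * target_stride] += packed[i];
}
```

Because the unit is applied directly at the given offset, the receiver never needs a copy of the origin data unpacked into the full target layout.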
-
This patch adds req types for FOP operation, and calls FOP req handler after SRBuf is unpacked. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Add a stream_offset field to ACC-related packets and to the request struct to remember the current stream unit's starting position in the entire target data. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Add a counter in the op struct to remember the number of stream units that have already been issued. For example, when the first stream unit piggybacked with LOCK is issued, we temporarily stop issuing the following units. After the origin receives the ACK from the target, it can continue to issue the following units. This counter helps avoid issuing the first unit again. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
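The role of the counter can be sketched as follows. The struct `demo_op`, its fields, and `demo_issue_units` are hypothetical names chosen for illustration; the actual sending of a unit is left as a comment.

```c
#include <stddef.h>

/* Hypothetical stand-in for the op struct with the new counter. */
struct demo_op {
    size_t total_units;     /* stream units that make up the operation */
    size_t issued_units;    /* stream units already issued */
};

/* Issues at most max_units further stream units and returns how many were
 * issued. Units before issued_units are never issued a second time. */
size_t demo_issue_units(struct demo_op *op, size_t max_units)
{
    size_t n = op->total_units - op->issued_units;

    if (n > max_units)
        n = max_units;
    /* Real code would send units [issued_units, issued_units + n) here. */
    op->issued_units += n;
    return n;
}
```

In the LOCK example above, the origin would first call this with max_units = 1, wait for the ACK, and then call it again to issue the remaining units, starting from the second one.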
-