- 23 Jun, 2015 4 commits
-
-
Rob Latham authored
Some libraries like HDF5 register their cleanup routines to run at finalize. These cleanup routines use MPI-IO, so they need to fire before ROMIO cleans up. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Halim Amer authored
Signed-off-by:
Min Si <msi@il.is.s.u-tokyo.ac.jp> Signed-off-by:
Rob Latham <robl@mcs.anl.gov>
-
Signed-off-by:
Sangmin Seo <sseo@anl.gov>
-
Signed-off-by:
Sangmin Seo <sseo@anl.gov>
-
- 22 Jun, 2015 4 commits
-
-
Rob Latham authored
Type promotions have resulted in a change to the device layer. Ref: #1767 Signed-off-by:
Pavan Balaji <balaji@anl.gov> Signed-off-by:
Sameh S Sharkawi <sssharka@us.ibm.com>
-
Rob Latham authored
Despite promoting types throughout the gather path, there was still one case of constructing structs with larger-than-int blocklens. Solution: borrow the BigMPI strategy and construct types-of-chunks to get around the limitation. Ref: #1767 Signed-off-by:
Pavan Balaji <balaji@anl.gov>
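The chunking idea mentioned above can be sketched as plain arithmetic: a count too large for an int is split into some number of full chunks plus a remainder, each of which fits in an int and can be described by its own datatype. This is only a sketch; the names `chunk_plan` and `plan_chunks` are hypothetical and do not come from MPICH or BigMPI.

```c
#include <stdint.h>

/* Hypothetical result type: how many full chunks, and how much is left over. */
struct chunk_plan {
    int64_t nchunks;   /* number of full chunks of chunk_size elements */
    int remainder;     /* leftover elements, always smaller than chunk_size */
};

/* Split a large element count into int-sized pieces.
 * Assumes count >= 0 and chunk_size > 0. */
static struct chunk_plan plan_chunks(int64_t count, int chunk_size)
{
    struct chunk_plan plan;
    plan.nchunks = count / chunk_size;
    plan.remainder = (int) (count % chunk_size);
    return plan;
}
```

With a chunk size of 1000, a count of 2500 becomes two full chunks and a remainder of 500. A real implementation would build one datatype per chunk and combine them.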
-
Rob Latham authored
The ongoing march towards 64-bit clean continues. Address areas where a large product of two ints might have overflowed. Ref: #1767 Signed-off-by:
Pavan Balaji <balaji@anl.gov>
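A minimal sketch of the kind of fix described above, assuming a 32-bit int: widen both operands before multiplying so the product is computed in 64 bits. The helper names `total_bytes` and `fits_in_int` are hypothetical and are not MPICH functions.

```c
#include <limits.h>
#include <stdint.h>

/* Multiply a count by an element size in 64-bit arithmetic.
 * The casts happen before the multiply, so the product cannot
 * wrap around in int. */
static int64_t total_bytes(int count, int elem_size)
{
    return (int64_t) count * (int64_t) elem_size;
}

/* Return 1 if the widened product still fits in an int, 0 otherwise. */
static int fits_in_int(int count, int elem_size)
{
    int64_t prod = total_bytes(count, elem_size);
    return prod >= INT_MIN && prod <= INT_MAX;
}
```

Writing `(int64_t) (count * elem_size)` instead would not help, because the multiplication would already have overflowed in int before the cast.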
-
Rob Latham authored
- preprocessor constants need parens
- which showed the "always fail" case wasn't big enough
- compiler warned about variables possibly being used uninitialized
Ref: #1767 Signed-off-by:
Pavan Balaji <balaji@anl.gov>
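The parenthesization problem is easy to reproduce in isolation. The constants below are hypothetical and are not the ones touched by this commit; they only show how an unparenthesized macro changes meaning once it is expanded inside a larger expression.

```c
/* Hypothetical constants for illustration only. */
#define CHUNK_UNSAFE 1024 * 1024      /* expands without grouping */
#define CHUNK_SAFE   (1024 * 1024)    /* expands as a single value */

/* Expands to n / 1024 * 1024, which is (n / 1024) * 1024. */
static long quotient_unsafe(long n)
{
    return n / CHUNK_UNSAFE;
}

/* Expands to n / (1024 * 1024), which is the intended division. */
static long quotient_safe(long n)
{
    return n / CHUNK_SAFE;
}
```

For n = 4194304, the unparenthesized version returns 4194304 while the parenthesized one returns 4.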
-
- 20 Jun, 2015 3 commits
-
-
Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
- 19 Jun, 2015 2 commits
-
-
Kenneth Raffenetti authored
No reviewer.
-
Rob Latham authored
Commit 83253a41 triggered a bunch of new warnings. Take a different approach. For simplicity of implementation, do_accumulate_op is defined as MPI_User_function. We could split up internal routines and user-provided routines, but that complicates the code for little benefit. Instead, keep do_accumulate_op with an int type, but check for overflow before explicitly casting. In many places the count is simply '1'. In stream processing there is an internal limit of 256k, so the assertion should never fire. Signed-off-by:
Xin Zhao <xinzhao3@illinois.edu>
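The "check for overflow before explicitly casting" approach can be sketched as a small narrowing helper. The name `checked_int_cast` is hypothetical; the actual MPICH code uses its own assertion macros and types.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Narrow a 64-bit count to int, asserting first that the value is
 * representable. If callers keep counts small, as the commit message
 * argues, the assertion never fires. */
static int checked_int_cast(int64_t count)
{
    assert(count >= 0 && count <= INT_MAX);
    return (int) count;
}
```

A count of 1, or a count bounded by a 256k streaming limit, passes the check and is cast without loss.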
-
- 18 Jun, 2015 2 commits
-
-
Kenneth Raffenetti authored
Encode request handles in unused tag bits, eliminating the need for hash table. Signed-off-by:
Antonio Pena Monferrer <apenya@mcs.anl.gov>
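Packing a handle into spare tag bits can be sketched as follows. The layout here is an assumption made for illustration (a 64-bit field whose low 32 bits hold the tag and whose high 32 bits are unused); the real bit layout in the netmod is not given in the commit message, and all names are hypothetical.

```c
#include <stdint.h>

#define TAG_BITS 32                               /* assumed tag width */
#define TAG_MASK ((UINT64_C(1) << TAG_BITS) - 1)  /* low 32 bits set */

/* Store the request handle in the upper bits, next to the tag. */
static uint64_t pack_tag_and_handle(uint32_t tag, uint32_t handle)
{
    return ((uint64_t) handle << TAG_BITS) | (uint64_t) tag;
}

/* Recover the handle directly from the bits, with no table lookup. */
static uint32_t unpack_handle(uint64_t bits)
{
    return (uint32_t) (bits >> TAG_BITS);
}

/* Recover the original tag. */
static uint32_t unpack_tag(uint64_t bits)
{
    return (uint32_t) (bits & TAG_MASK);
}
```

Because the handle travels with the message bits, the receiver can recover it by shifting, which is what makes a hash table unnecessary.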
-
Antonio Pena Monferrer authored
rma/manyget was failing because we reached the maximum number of MEs allowed: we were posting an ME for every remote get. This patch implements a single persistent ME instead for the remote gets. Fixes #2264 Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
- 17 Jun, 2015 4 commits
-
-
Kenneth Raffenetti authored
Signed-off-by:
Antonio J. Pena <apenya@mcs.anl.gov>
-
Kenneth Raffenetti authored
MPID_Segment_pack/unpack should be all we need to manipulate noncontig messages. Not sure why these were originally included. Signed-off-by:
Antonio J. Pena <apenya@mcs.anl.gov>
-
Kenneth Raffenetti authored
Because we don't drain the EQs in the event of flow control, we need to use a dedicated EQ for messages related to pausing and unpausing communication. These messages are all consumed by the Rportals layer; the user will never see them. Signed-off-by:
Antonio J. Pena <apenya@mcs.anl.gov>
-
Rob Latham authored
Signed-off-by:
Xin Zhao <xinzhao3@illinois.edu>
-
- 16 Jun, 2015 4 commits
-
-
The losers of head-to-head connections are not necessarily closed if the sock set is destroyed. This patch looks for all open connections, closes the sockets and frees the memory resources. Fixes #2180 Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
The loser of a head-to-head connection sometimes tries to reconnect later, after MPI_Finalize was called. This can lead to several errors in the socket layer, depending on the state of the discarded connection and the appearance of the connection events. Refs #2180 This patch handles this in two ways: 1.) Discarded connections are marked with CONN_STATE_DISCARD, so they are held back from connecting. Furthermore, an error on any discarded connection (because the remote side closed in MPI_Finalize) is ignored and the connection is closed. 2.) Add a finalize flag for process groups. If a process group is closing and tries to close all VCs, a flag is set to mark this. If the flag is set, a reconnection (in the socket state) is refused and the connection is closed on both sides. Both steps are necessary to catch all reconnection attempts after MPI_Finalize was called. Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
Kenneth Raffenetti authored
Ignore local completion events (SENDs) when counting outstanding ops to remote targets. No reviewer.
-
Kenneth Raffenetti authored
No reviewer.
-
- 15 Jun, 2015 5 commits
-
-
Originally, the Request_load_recv_iov() function assumed that the initial value of req->dev.segment_first is always zero, which is not correct if we set it to a non-zero value for streaming the RMA operations. Request_load_recv_iov() is triggered multiple times for the same receive request until all data is received. During this process, req->dev.segment_first is overwritten with the current offset value. When the initial value of req->dev.segment_first is non-zero, we need another variable to store that value until receiving for this request is finished. Here we use a static variable in this function for that purpose. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
This patch fixes mistakes in calculating the streaming size in the GetAccumulate pkt handler on the target side. The original code had two mistakes: 1. It used the size and extent of the target datatype, which is wrong; it should use the size and extent of the basic type in the target datatype. 2. It always used the total data size to calculate the current streaming size, which is wrong; it should use the remaining data size. This patch fixes these two issues. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
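The two corrections can be illustrated with a small calculation. This is a sketch under assumptions, not the handler code: `stream_unit_bytes`, its parameters, and the buffer limit are hypothetical names. It sizes each stream unit from the remaining data, in whole basic-type elements.

```c
#include <stddef.h>

/* Compute the size in bytes of the next stream unit.
 *   total_bytes : full payload size
 *   offset      : bytes already processed
 *   basic_size  : size of the basic type inside the target datatype
 *   buf_limit   : assumed capacity of the streaming buffer in bytes
 * Assumes offset <= total_bytes and basic_size > 0. */
static size_t stream_unit_bytes(size_t total_bytes, size_t offset,
                                size_t basic_size, size_t buf_limit)
{
    size_t rest = total_bytes - offset;         /* remaining data, not the total */
    size_t max_elems = buf_limit / basic_size;  /* whole basic elements per unit */
    size_t rest_elems = rest / basic_size;
    size_t n = rest_elems < max_elems ? rest_elems : max_elems;
    return n * basic_size;
}
```

With 1000 bytes of 8-byte elements and a 256-byte buffer, the first unit is 256 bytes, and the unit starting at offset 768 is only the remaining 232 bytes.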
-
Here we assign req->dev.segment_first to orig_segment_first. Since req->dev.segment_first is a MPIDI_msg_sz_t type, we should use the same type for orig_segment_first. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
This test occasionally runs for more than 3 min (the default time limit) on the OFI platform. This patch increases the time limit to 5 min. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Lena Oden authored
This test uses irecv and isend to transfer data in an alltoall manner between multiple processes. The idea of this test is to check whether MPI can handle multiple processes trying to connect to each other from both sides at the same time. Signed-off-by:
Antonio J. Pena <apenya@mcs.anl.gov>
-
- 14 Jun, 2015 3 commits
-
-
This patch includes three changes: (1) Added netmod API get_ordering to allow a netmod to expose the network ordering. A netmod may issue some packets via multiple connections in parallel if those packets (such as RMA) do not require ordering, and thus the packets may be unordered. This patch sets the network ordering in every existing netmod (tcp|mxm|ofi|portals|llc) to true, since all packets are sent in order via one connection. (2) Nemesis exposes the window packet orderings such as AM flush ordering at init time. It supports ordered packets only when the netmod supports an ordered network. (3) If AM flush is ordered (flush must be finished after all previous operations), then CH3 RMA only requests FLUSH ACK on the last operation. Otherwise, CH3 must request per-OP FLUSH ACK to ensure all operations are remotely completed. Signed-off-by:
Xin Zhao <xinzhao3@illinois.edu> Signed-off-by:
Pavan Balaji <balaji@anl.gov>
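The decision described in change (3) can be sketched as a single predicate. The function name and parameters are hypothetical; the real CH3 code tracks this through packet flags and per-target state.

```c
/* Decide whether operation number op_index (0-based) out of num_ops
 * should request a FLUSH ACK.
 *   am_flush_ordered != 0 : the flush completes after all earlier
 *                           operations, so one ACK on the last
 *                           operation covers everything.
 *   am_flush_ordered == 0 : operations may complete out of order, so
 *                           every operation needs its own ACK. */
static int needs_flush_ack(int op_index, int num_ops, int am_flush_ordered)
{
    if (am_flush_ordered)
        return op_index == num_ops - 1;
    return 1;
}
```

On an ordered network with four operations, only the fourth asks for an acknowledgement; on an unordered one, all four do.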
-
When window resources are used up, the current code frees OPs before completion only if flush_remote is ordered. However, we can always free them, even on an out-of-order network, because remote completion is tracked by the ack counter, and local completion (flush_local) is translated to remote completion (flush). Signed-off-by:
Xin Zhao <xinzhao3@illinois.edu> Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
The outstanding_acks counter was increased at each sync call (such as fence and flush). However, the counter had to be decreased again if a flush ack was not required. It is more straightforward to increase it only when the flush packet is issued (FLUSH flag piggyback or a separate flush message). Signed-off-by:
Xin Zhao <xinzhao3@illinois.edu> Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
- 12 Jun, 2015 9 commits
-
-
Pavan Balaji authored
-
Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Here we make check_and_switch_target/window_state return a flag indicating whether the current window/target states are OK for issuing operations. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
check_window_state ---> check_and_switch_window_state check_target_state ---> check_and_switch_target_state Both functions are used to check and switch (if possible) the RMA state. Here we rename them accordingly. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
When GACC/FOP is used with MPI_NO_OP, the operation is essentially an atomic GET. Originally MPICH implemented this by converting GACC/FOP to GET, which lost the atomicity of that operation. In this patch, we modify the implementation of GACC/FOP to support atomic GET. The main modifications are listed below: (1) When streaming a GACC operation, we originally used the origin data size to calculate the stream unit size. Since the origin data size is zero in atomic GET, we now use the target data size instead. (2) On the origin side, if it is an atomic GET, CH3 just issues the packet header and metadata for derived datatypes (if needed) and does not try to issue from the origin buffer; on the target side, after the packet header and metadata for derived datatypes (if needed) are received, the final request handler is triggered, and CH3 does not try to receive any data from the origin. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Originally MPICH checked the datatype of FOP by testing whether it is a BUILTIN type, which prohibits all pair types. This patch fixes this issue. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-