- 26 Jun, 2015 7 commits
-
-
Originally, we poked the progress engine at the end of RMA sync calls if it had not already been poked during the call. The purpose was to prevent a possible deadlock. However, the deadlock can only happen in self-lock cases; when the target is not the calling process, the poke adds unnecessary overhead to RMA sync calls. This patch removes those progress pokes and keeps only the ones where the target is the calling process. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
This patch adds a reduce-scatter based algorithm to MPI_Win_fence, which is used when the number of processes is small or medium. With this algorithm, memory usage is O(P), but the ending FENCE only needs to wait for local completion, not remote completion. When the number of processes is large, FENCE switches to the original barrier-based algorithm, which has O(1) memory usage but must wait for remote completion in the ending FENCE. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
On the target side, after receiving a GACC/FOP packet, we should first start sending back the data and then perform the ACC computation. This way, issuing the data and performing the computation can overlap. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
This optimization was missed in 7189bcde. Here we add it back so that when there is no issued active win or passive win, we skip the while loop in RMA progress. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Originally, the arguments passed to MPI_Win_create in this test were wrong. This patch fixes the issue. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
- 25 Jun, 2015 3 commits
-
-
Pavan Balaji authored
Signed-off-by:
Halim Amer <aamer@anl.gov>
-
Pavan Balaji authored
Signed-off-by:
Halim Amer <aamer@anl.gov>
-
Halim Amer authored
-
- 24 Jun, 2015 5 commits
-
-
Xin Zhao authored
In the Nemesis implementation of Win_gather_info(), we allocate a memory region in SHM to store window information for other processes, so that all processes on the same node can share that information. However, the memory size was previously set incorrectly to O(node_comm_size) when it should be O(comm_size). This patch fixes the bug. Signed-off-by:
Min Si <msi@il.is.s.u-tokyo.ac.jp> Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Xin Zhao authored
The original implementation of Win_flush_local counts the total number of local and remote completions it needs to wait for, and then waits for the current local/remote completion counts to reach those values. The bug is that the current counts must be reset to zero on each iteration of the while loop; otherwise, targets that have already completed are counted again and we fail to wait for some targets to complete. This patch fixes the issue. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
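The counting bug described in the Win_flush_local entry above can be illustrated with a minimal, self-contained sketch. This is not MPICH code: `target_t`, `poke_progress`, and `wait_all_local_complete` are hypothetical names, and the progress engine is simulated by retiring one pending operation per target per poke. The point is only that the "current" counter is reset at the top of every pass.

```c
#include <stddef.h>

/* Hypothetical per-target state; not MPICH's actual data structure. */
typedef struct {
    int pending_ops;            /* operations not yet locally completed */
} target_t;

/* Hypothetical stand-in for one progress poke:
 * retires one pending operation on every unfinished target. */
static void poke_progress(target_t *targets, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (targets[i].pending_ops > 0)
            targets[i].pending_ops--;
}

/* Waits until every target is locally complete.
 * Returns the number of progress pokes that were needed. */
int wait_all_local_complete(target_t *targets, size_t n)
{
    const size_t total = n;     /* completions we must observe */
    size_t done;
    int pokes = 0;

    do {
        done = 0;               /* reset on every pass; if this is hoisted out of
                                 * the loop, finished targets are counted again
                                 * and the loop can exit too early */
        for (size_t i = 0; i < n; i++)
            if (targets[i].pending_ops == 0)
                done++;

        if (done < total) {
            poke_progress(targets, n);
            pokes++;
        }
    } while (done < total);

    return pokes;
}
```

If `done` were initialized once before the loop, a target that finished early would be counted on every pass, so `done` could reach `total` while other targets still had pending operations.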
-
No reviewer. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Charles J Archer authored
Signed-off-by:
Junchao Zhang <jczhang@mcs.anl.gov>
-
Rob Latham authored
No Reviewer
-
- 23 Jun, 2015 9 commits
-
-
Junchao Zhang authored
No reviewer
-
Rob Latham authored
Lisandro Dalcin <dalcinl@gmail.com> reports that mpi4py's test suite invokes MPIR_Add_finalize() 33 times. It's been 6.5 years since we doubled it, so bump it up once again. Closes: #2272 Signed-off-by:
Junchao Zhang <jczhang@mcs.anl.gov>
-
Rob Latham authored
disable this memory-intensive test on 32 bit platforms No Reviewer
-
Rob Latham authored
resize and struct are the two type constructors that can set the LB and UB markers on a type. Struct, due to MPI-1 ideas, is a strange beast (the markers adjust only if they are lower/higher than the old ones (!)), but for resized it's clear that the markers shift. Closes #2088 Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Rob Latham authored
in deep types we might want to update the lb and ub, not simply append/prepend two tuples to the flattened representation. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Rob Latham authored
some libraries like HDF5 want to register their cleanup routines into finalize. these cleanup routines use MPI-IO, so they need to fire before ROMIO cleans up. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Halim Amer authored
Signed-off-by:
Min Si <msi@il.is.s.u-tokyo.ac.jp> Signed-off-by:
Rob Latham <robl@mcs.anl.gov>
-
Signed-off-by:
Sangmin Seo <sseo@anl.gov>
-
Signed-off-by:
Sangmin Seo <sseo@anl.gov>
-
- 22 Jun, 2015 4 commits
-
-
Rob Latham authored
type promotions have resulted in a change to the device layer. Ref: 1767 Signed-off-by:
Pavan Balaji <balaji@anl.gov> Signed-off-by:
Sameh S Sharkawi <sssharka@us.ibm.com>
-
Rob Latham authored
despite promoting types throughout the gather path, still had one case of constructing structs with larger-than-int blocklens. solution: borrow BigMPI strategy and construct types-of-chunks to get around limitations. Ref: #1767 Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Rob Latham authored
The ongoing march towards 64-bit clean continues. Address areas where a large product of two ints might have overflowed. Ref: #1767 Signed-off-by:
Pavan Balaji <balaji@anl.gov>
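As a generic illustration of the overflow class mentioned in the entry above (not the actual change), the sketch below widens both operands before multiplying so the product is computed in 64 bits. The function name `total_bytes` and its parameters are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: cast to a 64-bit type BEFORE multiplying.
 * Writing (int64_t)(count * elem_size) would be too late, because the
 * multiplication would already have been done in int and could overflow. */
int64_t total_bytes(int count, int elem_size)
{
    return (int64_t) count * (int64_t) elem_size;
}
```

For example, `total_bytes(65536, 65536)` is 2^32, which does not fit in a 32-bit int.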
-
Rob Latham authored
- preprocessor constants need parens - which showed the "always fail" case wasn't big enough - compiler warned about variables possibly being used uninitialized Ref: #1767 Signed-off-by:
Pavan Balaji <balaji@anl.gov>
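The first item in the entry above, that preprocessor constants need parentheses, can be shown with a small sketch. The macro names and values here are made up for illustration and are not the constants from the test in question.

```c
/* Hypothetical constants; the values are illustrative only. */
#define CHUNK_UNSAFE 1024 + 1       /* no parentheses */
#define CHUNK_SAFE   (1024 + 1)     /* parenthesized */

/* Expands to n * 1024 + 1: the multiplication binds tighter than the +. */
int scaled_unsafe(int n)
{
    return n * CHUNK_UNSAFE;
}

/* Expands to n * (1024 + 1), which is what the author of the macro meant. */
int scaled_safe(int n)
{
    return n * CHUNK_SAFE;
}
```

With the unparenthesized macro, a size computed from the constant comes out smaller than intended, which is one way a test case meant to be "too big" can end up not being big enough.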
-
- 20 Jun, 2015 3 commits
-
-
Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
- 19 Jun, 2015 2 commits
-
-
Kenneth Raffenetti authored
No reviewer.
-
Rob Latham authored
commit 83253a41 triggered a bunch of new warnings. Take a different approach. For simplicity of implementation, do_accumulate_op is defined as MPI_User_function. We could split up internal routines and user-provided routines, but that complicates the code for little benefit. Instead, keep do_accumulate_op with an int type, but check for overflow before explicitly casting. In many places the count is simply '1'. In stream processing there is an internal limit of 256k, so the assertion should never fire. Signed-off-by:
Xin Zhao <xinzhao3@illinois.edu>
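The "check for overflow before explicitly casting" approach from the entry above might look roughly like the following sketch. `checked_count_to_int` is a hypothetical helper, not an MPICH function, and it uses the standard `assert` rather than MPICH's own assertion macros.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: narrow a 64-bit count to int only after asserting
 * that the value is representable, so the explicit cast cannot truncate. */
int checked_count_to_int(int64_t count)
{
    assert(count >= 0 && count <= INT_MAX);
    return (int) count;
}
```

Callers that pass a count of 1, or a count bounded by a small internal limit, never trip the assertion, while the explicit cast keeps the compiler from warning about an implicit narrowing conversion.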
-
- 18 Jun, 2015 2 commits
-
-
Kenneth Raffenetti authored
Encode request handles in unused tag bits, eliminating the need for a hash table. Signed-off-by:
Antonio Pena Monferrer <apenya@mcs.anl.gov>
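The idea of carrying a handle in otherwise unused bits can be sketched as below. The layout is an assumption made purely for illustration (a 64-bit word with the tag in the low 32 bits and the handle in the high 32 bits); the actual bit layout used by the commit is not stated here, and all names are hypothetical.

```c
#include <stdint.h>

/* Assumed layout, for illustration only:
 *   bits  0..31  tag
 *   bits 32..63  request handle */
#define TAG_BITS 32u
#define TAG_MASK ((UINT64_C(1) << TAG_BITS) - 1)

/* Pack a tag and a handle into one 64-bit word. */
uint64_t pack_bits(uint32_t tag, uint32_t handle)
{
    return ((uint64_t) handle << TAG_BITS) | (uint64_t) tag;
}

/* Recover the tag from the low bits. */
uint32_t unpack_tag(uint64_t bits)
{
    return (uint32_t) (bits & TAG_MASK);
}

/* Recover the handle from the high bits; no table lookup is needed. */
uint32_t unpack_handle(uint64_t bits)
{
    return (uint32_t) (bits >> TAG_BITS);
}
```

Because the handle travels with the message bits, the receiver can recover it with a shift instead of looking it up in a hash table.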
-
Antonio Pena Monferrer authored
rma/manyget was failing because we reached the maximum number of MEs allowed: we were posting an ME for every remote get. This patch implements a single persistent ME instead for the remote gets. Fixes #2264 Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
- 17 Jun, 2015 4 commits
-
-
Kenneth Raffenetti authored
Signed-off-by:
Antonio J. Pena <apenya@mcs.anl.gov>
-
Kenneth Raffenetti authored
MPID_Segment_pack/unpack should be all we need to manipulate noncontig messages. Not sure why these were originally included. Signed-off-by:
Antonio J. Pena <apenya@mcs.anl.gov>
-
Kenneth Raffenetti authored
Because we don't drain the EQs in the event of flow control, we need to use a dedicated EQ for messages related to pausing and unpausing communication. These messages are all consumed by the Rportals layer; the user will never see anything. Signed-off-by:
Antonio J. Pena <apenya@mcs.anl.gov>
-
Rob Latham authored
Signed-off-by:
Xin Zhao <xinzhao3@illinois.edu>
-
- 16 Jun, 2015 1 commit
-
-
Losers of head-to-head connections are not necessarily closed when the sock set is destroyed. This patch looks for all open connections, closes their sockets, and frees their memory resources. Fixes #2180 Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-