- 20 Oct, 2014 6 commits
-
-
Wesley Bland authored
Tags were sometimes being tested when the communication call had already failed, leading to bad asserts later due to uninitialized values. Sometimes, these results would have ended up with an error anyway, but sometimes not. Signed-off-by:
Huiwei Lu <huiweilu@mcs.anl.gov>
-
Wesley Bland authored
The calls in MPID_Comm_get_all_failed_procs and MPID_Comm_agree were the wrong macros for entering and exiting an MPID function. This corrects it. Signed-off-by:
Huiwei Lu <huiweilu@mcs.anl.gov>
-
Huiwei Lu authored
Revoke calls MPIDI_CH3U_Clean_recvq to dequeue all requests with revoked communicators. One case was missing: hierarchical communicators that use a different context id. This patch adds a check for the hierarchical communicators. Signed-off-by:
Wesley Bland <wbland@anl.gov>
-
Huiwei Lu authored
Simplifies the test to only use 2 processes and not make as many MPI calls. Modifications by Wesley Bland. Signed-off-by:
Wesley Bland <wbland@anl.gov>
-
Pavan Balaji authored
This makes sure that macros work the normal way (using a semicolon at the end, etc.). It also removes a block of unused code from mx_cancel.c. Modified by Wesley to split from previous patch. Signed-off-by:
Wesley Bland <wbland@anl.gov> Signed-off-by:
Pavan Balaji <balaji@anl.gov>
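For a statement-like macro, working "the normal way" usually means it expands to a single statement that the caller ends with a semicolon. A minimal sketch of the common do { ... } while (0) idiom follows. The macro and function names are hypothetical and are not taken from the MPICH source.

```c
/* Hypothetical statement-like macro, not from MPICH. Wrapping the body in
 * do { ... } while (0) makes the expansion a single statement, so the
 * caller ends it with a semicolon just like a function call. */
#define RESET_PAIR(a, b) \
    do {                 \
        (a) = 0;         \
        (b) = 0;         \
    } while (0)

/* With a bare { ... } body instead, the semicolon after the macro call
 * would terminate the if statement and the else below would not compile. */
int reset_if_flagged(int flag, int *x, int *y)
{
    if (flag)
        RESET_PAIR(*x, *y);
    else
        return 0;   /* nothing was reset */
    return 1;       /* both values were reset */
}
```

Calling reset_if_flagged(1, &x, &y) zeroes both values; with flag set to 0 they are left untouched.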
-
Pavan Balaji authored
We were not setting the function states correctly in a bunch of functions. Modifications by Wesley to split up big commit. Signed-off-by:
Wesley Bland <wbland@anl.gov> Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
- 17 Oct, 2014 6 commits
-
-
The default linker behavior on BGQ makes interlibrary dependencies tricky to support correctly. Just disable them to make our lives easier. Signed-off-by:
Rob Latham <robl@mcs.anl.gov>
-
Signed-off-by:
Rob Latham <robl@mcs.anl.gov>
-
There are no pami optimized collectives available for inTERcomms, so when MPIDI_Coll_comm_create is called it checks whether the comm is an inTRAcomm and, if not, just returns. However, it was doing the new malloc for comm->coll_fns before the if-check, which was unsetting the MPICH defaults. The solution is to move the malloc after the inTRAcomm if-check. Signed-off-by:
Rob Latham <robl@mcs.anl.gov>
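The ordering fix described above can be sketched roughly as follows. The struct layout, field names, and function name are simplified stand-ins assumed for illustration; they are not the real pamid definitions.

```c
#include <stdlib.h>

/* Simplified stand-ins (assumptions), not the real MPICH/pamid types. */
struct coll_fns { int optimized; };
struct comm {
    int is_intracomm;            /* nonzero for an intracommunicator */
    struct coll_fns *coll_fns;   /* starts out pointing at the defaults */
};

/* Returns 0 on success, -1 on allocation failure. */
int coll_comm_create(struct comm *c)
{
    /* Check first: for an intercomm, return before touching coll_fns
     * so the default function table stays in place. */
    if (!c->is_intracomm)
        return 0;

    /* Only an intracomm gets a freshly allocated table. */
    struct coll_fns *fns = malloc(sizeof *fns);
    if (fns == NULL)
        return -1;
    fns->optimized = 1;
    c->coll_fns = fns;
    return 0;
}
```

Doing the allocation before the check would replace the pointer for every communicator, including the intercomms that should keep the defaults.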
-
The optimized pami code currently within MPIDO_Ibarrier is incorrect. For now, do not run it; instead, just kick back to MPICH if mpir_nbc is set, otherwise call the blocking MPIR_Barrier(). Signed-off-by:
Rob Latham <robl@mcs.anl.gov>
-
At the end of MPIDI_PAMI_context_init, MPIDI_Init_collsel_extension is called to enable the dynamic optimized collective advisor. This is not supported on BGQ, so ifdef out the call. Signed-off-by:
Rob Latham <robl@mcs.anl.gov>
-
Since Blue Gene/Q does not support dynamic tasking, there was only 1 element in the MPID_VCR_t data structure, so a shortcut was taken to avoid a malloc and free of a new list of pami_task_t in the form the pami geometry creation was expecting. However, it seems an array of structures with 1 pami_task_t element in it is not exactly the same in memory as an array of pami_task_t themselves, so the pami geometry creation was failing. The fix is to simply do what all other platforms do and malloc a separate list of pami_task_t. Signed-off-by:
Rob Latham <robl@mcs.anl.gov>
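A minimal sketch of the "malloc a separate list" approach, assuming simplified stand-in types: task_id_t stands in for pami_task_t and struct vcr_entry for one MPID_VCR_t element. The names and the helper are hypothetical. Copying each id into a plain array avoids relying on the struct array having the same layout as an array of ids.

```c
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t task_id_t;      /* stand-in for pami_task_t (assumed) */

struct vcr_entry {               /* stand-in for one MPID_VCR_t element */
    task_id_t taskid;
};

/* Build the plain task-id array a consumer expects instead of casting
 * the array of structs. Returns NULL on allocation failure; the caller
 * frees the result. */
task_id_t *copy_task_ids(const struct vcr_entry *vcr, size_t n)
{
    task_id_t *tasks = malloc(n * sizeof *tasks);
    if (tasks == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        tasks[i] = vcr[i].taskid;   /* element-wise copy, layout independent */
    return tasks;
}
```

The cost is one extra allocation and free per call, which is the price the shortcut was trying to avoid.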
-
- 16 Oct, 2014 2 commits
-
-
Kenneth Raffenetti authored
If a message size is <= PTL_LARGE_THRESHOLD, use a single operation. Previously, this would generate unnecessary 0-byte operations when messages were exactly the size of the threshold. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
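The boundary condition can be illustrated with a small sketch. It assumes a message is sent as one operation of up to the threshold size plus, when needed, one follow-up operation for the rest; the real portals4 netmod may split messages differently. LARGE_THRESHOLD and the function names are hypothetical.

```c
#include <stddef.h>

#define LARGE_THRESHOLD 1024u   /* hypothetical value, for illustration only */

/* Bytes left for a follow-up operation after the first LARGE_THRESHOLD bytes. */
size_t remainder_bytes(size_t len)
{
    return len > LARGE_THRESHOLD ? len - LARGE_THRESHOLD : 0;
}

/* Old-style check: a message exactly at the threshold takes the
 * two-operation path, and its second operation carries 0 bytes. */
size_t ops_strict_less(size_t len)
{
    return len < LARGE_THRESHOLD ? 1 : 2;
}

/* Fixed check: anything up to and including the threshold is one operation. */
size_t ops_less_equal(size_t len)
{
    return len <= LARGE_THRESHOLD ? 1 : 2;
}
```

For a message of exactly LARGE_THRESHOLD bytes, ops_strict_less reports two operations while remainder_bytes is 0, which is the unnecessary 0-byte operation the commit removes.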
-
Kenneth Raffenetti authored
Previously, the message pointer in an improbe call in the portals4 netmod layer was set to MPI_MESSAGE_NULL if there was no match. This is incorrect because the ch3 layer eventually returns either a valid pointer or NULL. As ch3 already starts the value at NULL, we can just omit the update. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
- 14 Oct, 2014 4 commits
-
-
Wesley Bland authored
This CVAR was deprecated in MPICH 3.1 and will now be removed for MPICH 3.2. Signed-off-by:
Junchao Zhang <jczhang@mcs.anl.gov>
-
Huiwei Lu authored
They are known to be failing. Mark them as xfail so they will not send false alarms to other patches. No reviewer.
-
We need to check both the build and src directories before installing the man and www pages. We were only checking the build directory for man and the src directory for www. Also, make sure to install both the man pages and the www pages on install. Signed-off-by:
Sangmin Seo <sseo@anl.gov>
-
Wesley Bland authored
Signed-off-by:
Junchao Zhang <jczhang@mcs.anl.gov>
-
- 13 Oct, 2014 1 commit
-
-
Kenneth Raffenetti authored
Remove incorrect event from send handler assertion. PTL_EVENT_PUT should only be seen at the target. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
- 10 Oct, 2014 4 commits
-
-
Wesley Bland authored
This reverts commit dd62e809.
-
Pavan Balaji authored
This doesn't pass with mpich yet. Signed-off-by:
Wesley Bland <wbland@anl.gov>
-
A memory leak can appear when a netmod is used. Comm override functions are responsible for creating their own requests, so they need to come before the sreq is created. Signed-off-by:
Igor Ivanov <Igor.Ivanov@itseez.com> Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Fixed a wrong parameter check condition for MPI_Iallgather and MPI_Iallgatherv: -1 is a valid value for sendcount when MPI_IN_PLACE is used. The MPI spec says: "The in place option for intracommunicators is specified by passing the value MPI_IN_PLACE to the argument sendbuf at all processes. In such a case, sendcount and sendtype are ignored, and the input data of each process is assumed to be in the area where that process would receive its own contribution to the receive buffer." Signed-off-by:
Igor Ivanov <Igor.Ivanov@itseez.com> Signed-off-by:
Pavan Balaji <balaji@anl.gov>
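A sketch of the corrected check, written without the MPI headers so it stands alone. IN_PLACE is a hypothetical stand-in for MPI_IN_PLACE and check_sendcount is an illustrative name, not the MPICH validation routine.

```c
/* Stand-in sentinel for MPI_IN_PLACE (assumption): any unique address works. */
static char in_place_marker;
#define IN_PLACE ((const void *)&in_place_marker)

/* Returns 0 if the arguments are acceptable, -1 if sendcount is invalid. */
int check_sendcount(const void *sendbuf, int sendcount)
{
    /* With the in-place option, sendcount and sendtype are ignored,
     * so no value of sendcount (including -1) is an error. */
    if (sendbuf == IN_PLACE)
        return 0;

    /* Otherwise a negative count is rejected. */
    return sendcount < 0 ? -1 : 0;
}
```

The sentinel test has to come before the count validation; checking the count first is what rejected otherwise valid in-place calls.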
-
- 09 Oct, 2014 1 commit
-
-
Igor Ivanov authored
Signed-off-by:
Igor Ivanov <Igor.Ivanov@itseez.com>
-
- 08 Oct, 2014 3 commits
-
-
Pavan Balaji authored
In some cases, we cannot let hcoll use whatever transport it needs. For example, ch3:sock assumes that, while blocking, the next event will come over the socket channel. If HCOLL decides to use a different transport (such as mxm) and the next event comes on that transport, this can result in a deadlock. In this patch, we let the channel specify what transports it can accept. Signed-off-by:
Devendar Bureddy <devendar@mellanox.com>
-
Pavan Balaji authored
Tell users that they can disable warnings if needed. Signed-off-by:
Devendar Bureddy <devendar@mellanox.com>
-
If the ref-count is 1, we *should* be the only thread holding a reference. It is therefore legal to skip atomics. If that is insufficient justification, consider that there are only three things a concurrent thread should be doing: 1) Reading: Not a problem, since the store is atomic. 2) Decrementing: Illegal; it would go negative using pure atomics. 3) Incrementing: Illegal; we are about to destroy the object. Add #ifdef guard for use of 'aggressive counter optimizations' (ibm) 20a558c3802fb3e138eb6b3c786a434538efffdd Signed-off-by:
Joe Ratterman <jratt@us.ibm.com> Signed-off-by:
Michael Blocksome <blocksom@us.ibm.com> Updated by Pavan Balaji to use an OPA atomic load instead of a simple load operation. Fixes #1835 Signed-off-by:
Pavan Balaji <balaji@anl.gov> Signed-off-by:
William Gropp <wgropp@illinois.edu>
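The reasoning above can be sketched with C11 atomics. This is an illustration only: MPICH uses OPA primitives rather than <stdatomic.h>, and the struct and function names here are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct object {
    atomic_int ref_count;
};

/* Drop one reference. Returns true when the caller held the last
 * reference and should destroy the object. */
bool release_ref(struct object *obj)
{
    /* Atomic load, in the spirit of the update that replaced a plain load. */
    if (atomic_load(&obj->ref_count) == 1) {
        /* Sole owner: no other thread may legally touch the count now,
         * so a plain store replaces the atomic read-modify-write. */
        atomic_store_explicit(&obj->ref_count, 0, memory_order_relaxed);
        return true;
    }

    /* Shared: fall back to the atomic decrement. fetch_sub returns the
     * previous value, so 1 means this call released the last reference. */
    return atomic_fetch_sub(&obj->ref_count, 1) == 1;
}
```

The fast path saves one atomic read-modify-write on the common case where an object is released by its only owner.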
-
- 07 Oct, 2014 4 commits
-
-
Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Signed-off-by:
Rob Latham <robl@mcs.anl.gov>
-
- 06 Oct, 2014 4 commits
-
-
Junchao Zhang authored
Signed-off-by:
Junchao Zhang <jczhang@mcs.anl.gov>
-
halim amer authored
This variable is incremented correctly by the MPIDI_CH3U_Recvq_FDP_or_AEU routine, but the MPIDI_CH3U_Recvq_DP routine fails to do so, although it also goes through the posted recv queue. For instance, the MXM netmod calls this routine. Signed-off-by:
Junchao Zhang <jczhang@mcs.anl.gov>
-
Sameh Sharkawi authored
The non-blocking neighborhood collectives were not set up correctly in pamid. ibm (D199919) Signed-off-by:
Su Huang <suhuang@us.ibm.com>
-
Pavan Balaji authored
We were only exposing a part of the allocated buffer as public memory, but attempting to use all of it remotely. Thanks to Charles Archer @ Intel for reporting the error and contributing the fix. Signed-off-by:
Min Si <msi@il.is.s.u-tokyo.ac.jp>
-
- 05 Oct, 2014 1 commit
-
-
Wesley Bland authored
This reverts commit 32f3a924.
-
- 04 Oct, 2014 1 commit
-
-
Kenneth Raffenetti authored
Invalidate the handle of an ME that has been matched and handled in the portals4 netmod *before* dequeuing the request. This prevents potential double unlinking failures. Signed-off-by:
Antonio J. Pena <apenya@mcs.anl.gov>
-
- 03 Oct, 2014 3 commits
-
-
The previous code relied on a trick to get the true context id using an invalid communicator: it called MPIR_Get_contextid() on a new communicator that was not completely built (no comm_commit and comm_hook calls). netmod/mxm uses comm_hook and cannot work with this trick. This change avoids calling into an invalid communicator by using a temporary (intermediate) communicator. Signed-off-by:
Igor Ivanov <Igor.Ivanov@itseez.com> Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
This example does not rely on getcpu() or sched_getcpu(), which are not available on old kernels/glibc. Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
- Fix issue in anysource match; - Improve debug output; Signed-off-by:
Igor Ivanov <Igor.Ivanov@itseez.com> Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-