- 08 Aug, 2013 4 commits
-
-
Wesley Bland authored
The error message in MPID_nem_tcp_bind line 580 was incorrectly reporting the port that was being attempted, because the variable is incremented at the end of the for loop and the error message did not account for that. This patch subtracts one if the port was not successfully bound when printing out the value for the error message. Thanks to Yauheni Zelenko for reporting this error. Signed-off-by:
Huiwei Lu <huiweilu@mcs.anl.gov>
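The off-by-one described in this commit can be sketched as follows. This is an illustrative sketch only, not the MPICH source: `try_bind` and `bind_in_range` are hypothetical names, and the real bind call is replaced by a stand-in that succeeds on a single port.

```c
#include <assert.h>

/* Hypothetical stand-in for bind(): succeeds only on `free_port`. */
static int try_bind(int port, int free_port)
{
    return port == free_port ? 0 : -1;
}

/* Scan [low, high] for a bindable port.
 * Returns 0 and stores the bound port in *reported on success.
 * Returns -1 and stores the last port actually attempted on failure. */
static int bind_in_range(int low, int high, int free_port, int *reported)
{
    int port;
    int ret = -1;

    assert(low <= high);
    for (port = low; port <= high; ++port) {
        ret = try_bind(port, free_port);
        if (ret == 0)
            break;              /* port still holds the bound value */
    }

    /* If the loop ran to completion, port is high + 1, one past the
     * last port that was tried, so subtract one before reporting it. */
    *reported = (ret == 0) ? port : port - 1;
    return ret;
}
```

With a range of 5000 to 5003 and no free port, the reported value is 5003, not 5004.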
-
There are three configure options and two install cases that need to be handled. The configure could have been done on a bgq system that has V1R2M1+ installed or on a bgq system that has a pre-V1R2M1 installation. In V1R2M1 the location and names of the pami libraries changed. The three configure options are:
--with-bgq-install-dir ..... forces use of a specific bgq install
--with-pami[-include|-lib] . forces use of a specific pami install
PAMILIBNAME ................ forces use of a specific pami library name
Signed-off-by:
Bob Cernohous <bobc@us.ibm.com> Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
Pavan Balaji authored
Increase the number of RMA operations issued within the epoch. The error seems to happen rarely, so increasing the number of RMA operations increases the probability of it occurring. Signed-off-by:
Xin Zhao <xinzhao3@illinois.edu>
-
Pavan Balaji authored
We didn't change ABI from 3.0.x, so the 'c' value should still be 10.
-
- 07 Aug, 2013 7 commits
-
-
Pavan Balaji authored
-
Kenneth Raffenetti authored
Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
Pavan Balaji authored
We were jumping to exit early, causing us to try to free an uninitialized string stash list. We should really use C++ constructor-style allocation functions for this. This commit just moves the initialization earlier, so we don't try to free garbage when we jump to exit. Signed-off-by:
Antonio J. Pena <apenya@mcs.anl.gov>
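The pattern behind this fix can be sketched as below: initialize every resource before the first jump to the cleanup label, so the cleanup code never sees garbage. The `stash` type and the `stash_init`, `stash_free`, and `process` functions are hypothetical names for illustration and are not taken from the MPICH source.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical list type standing in for the string stash. */
struct stash {
    char **items;
    int count;
};

static void stash_init(struct stash *s)
{
    s->items = NULL;
    s->count = 0;
}

static void stash_free(struct stash *s)
{
    int i;
    for (i = 0; i < s->count; ++i)
        free(s->items[i]);
    free(s->items);             /* free(NULL) is a no-op */
    s->items = NULL;
    s->count = 0;
}

/* Returns 0 on success, -1 on rejected input or allocation failure. */
static int process(const char *input)
{
    struct stash s;
    int status = 0;

    stash_init(&s);             /* must happen before the first goto */

    if (input == NULL) {
        status = -1;
        goto fn_exit;           /* safe: s is already initialized */
    }

    s.items = malloc(sizeof(*s.items));
    if (s.items == NULL) {
        status = -1;
        goto fn_exit;
    }
    s.items[0] = malloc(1);
    if (s.items[0] == NULL) {
        status = -1;
        goto fn_exit;           /* count is still 0, only items is freed */
    }
    s.items[0][0] = '\0';
    s.count = 1;

  fn_exit:
    stash_free(&s);
    return status;
}
```

If `stash_init` were called after the `input == NULL` check, the early jump would hand uninitialized fields to `stash_free`.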
-
We need the environment-passed parameters to determine what thread-level needs to be used. Fixes ticket #1892. Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
Do not overload the 'mpid.next' pointer in the request as this field is defined by the mpid layer and therefore is specific to the device. Not all adi implementations have this field. Signed-off-by:
Haizhu Liu <haizhu@us.ibm.com> Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
This patch overloads the MPID_Request.mpid.next pointer to use for referencing the debugger SEND queue structure to facilitate faster removal during completion. MPID_Request.mpid.next is only used by receives and free-pools otherwise. (ibm) Issue 8178 (ibm) e1c4d5cfe88fccc67624f81497a74bc55d774e43 Signed-off-by:
Su Huang <suhuang@us.ibm.com> Signed-off-by:
Michael Blocksome <blocksom@us.ibm.com> Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
(ibm) bd92b21a52afd8f4f6ef018d764d50060966f20b Signed-off-by:
Michael Blocksome <blocksom@us.ibm.com> Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
- 06 Aug, 2013 5 commits
-
-
Pavan Balaji authored
Allows the minimal configure ../configure --host=powerpc64-bgq-linux --with-device=pamid --enable-thread-cs=per-object. Without this change the MPID_REQUEST_PREALLOC symbol would be undefined. Signed-off-by:
Michael Blocksome <blocksom@us.ibm.com>
-
Removed the processing of the (undocumented) environment variable 'PAMID_CORE_ON_ABORT' which was being checked to determine if the user does *not* want the process to core dump. On Blue Gene/Q the core dump was accomplished by calling 'abort()' which sends SIGABRT to all processes and all processes would then write a core file. This is not scalable. Instead, MPID_Abort() will invoke 'exit(1)' which will terminate all processes in the job. This behavior is identical for both the POE and the Blue Gene/Q control systems. On Blue Gene/Q the user may replicate the previous core dump behavior by using the environment variables 'BG_COREDUMPONERROR=1' or 'BG_COREDUMPONEXIT=1'. Finally, the 'DYNAMIC_TASKING' #ifdef is moved up so it is checked first. 'MPIDI_NO_ASSERT' and 'DYNAMIC_TASKING' are typically defined for PE. It appears that the dynamic tasking code was never being invoked. (ibm) CPS 99YURA Signed-off-by:
Bob Cernohous <bobc@us.ibm.com>
-
These tests should not be distributed with the test suite when grabbing a release. They are still enabled for developer builds. Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
-
Charles Archer authored
Signed-off-by:
Michael Blocksome <blocksom@us.ibm.com>
-
- 05 Aug, 2013 1 commit
-
-
Kenneth Raffenetti authored
Adds an FT test that attempts communication without touching a failed process. Other changes are to use SIGKILL to simulate failures and also to flush all output to stdout. Signed-off-by:
Wesley Bland <wbland@mcs.anl.gov>
-
- 03 Aug, 2013 1 commit
-
-
Pavan Balaji authored
This update fixes the following problems:
* Unexpected connection, mostly appears with number of ranks > 3
* Several problems related to SCIF DMA usage
* Intel(R) Symmetric Communication Interface (Intel(R) SCI) registered memory used for DMA is not unregistered yet
Current status:
* IMB-MPI1 w/ CHECK runs well
* MPICH tests run except spawn tests
Known issues:
* Spawn tests still fail
-
- 02 Aug, 2013 4 commits
-
-
Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
Rob Latham authored
Most of this stuff is either broken, for ancient platforms, or both. Let's re-commit ourselves to using autoconf as it was intended, testing for features, not platforms.
-
Pavan Balaji authored
Fixes ticket #1862. Signed-off-by:
Antonio J. Pena <apenya@mcs.anl.gov>
-
William Gropp authored
Added test to help check #1815, as weak symbol support is used to implement the PMPI and MPI routines in a single object file, and there are few tests for correct operation of the PMPI routines in the test suite (a more comprehensive test of PMPI could use CPP macros to convert all existing tests to try PMPI, but typically either the PMPI routines are all available, or there is a problem).
-
- 01 Aug, 2013 18 commits
-
-
Pavan Balaji authored
This seems to fail with MPICH for >= 4096 back-to-back RMA operations. Signed-off-by:
Xin Zhao <xinzhao3@illinois.edu>
-
Pavan Balaji authored
-
shm_base_addrs is allocated only when shared memory region is allocated in nemesis. Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
Set the 'alloc_shm' info key to TRUE by default. If 'alloc_shm' is set by the user and the value is not TRUE, throw an error. Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
Assign new MPIDI_CH3U_Win_allocate to (*allocate_shared) so that it will be called by MPID_Win_allocate_shared. Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
Some places in the code use (void**)base_ptr and others use (void*)base_ptr, which is confusing. We make every win_allocate-related function take (void*)base_ptr as the argument and do the conversion inside the function, in order to keep the code clear. Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
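The base_ptr convention described in this commit can be sketched as follows: the function accepts a plain `void *` that actually holds the address of the caller's pointer variable, and converts it to `void **` internally. The function name `win_allocate_sketch` is hypothetical, and `malloc` stands in for whatever allocation the real window code performs.

```c
#include <assert.h>
#include <stdlib.h>

/* base_ptr is declared void* but holds the address of the caller's
 * pointer variable. The conversion to void** is done here, once,
 * rather than at every call site. */
static int win_allocate_sketch(size_t size, void *base_ptr)
{
    void **base_pp = (void **) base_ptr;

    assert(base_pp != NULL);
    *base_pp = malloc(size);
    return (*base_pp != NULL) ? 0 : -1;
}
```

A caller declares `void *buf;` and passes `&buf` with no cast, which matches how the `baseptr` argument of `MPI_Win_allocate` is declared as `void *` in the C binding.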
-
Because we delete the default MPIDI_SHM_Win_free in CH3, which originally deals with win_allocate_shared in CH3, we need to modify MPIDI_Win_free to handle it. Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
Also add prototype in header file. Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
Some places in the code use (void**)base_ptr and others use (void*)base_ptr, which is confusing. We make every win_allocate-related function take (void*)base_ptr as the argument and do the conversion inside the function, in order to keep the code clear. Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
In nemesis layer, change function name from MPIDI_CH3I_Win_allocate_shared to MPIDI_CH3I_Win_allocate_shm. MPIDI_CH3I_Win_allocate_shm is assigned to (*allocate_shm). Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
When judging whether the origin and target processes are on the same node, use the vc->node_id flag instead of the vc->ch.is_local flag. The 'is_local' flag is not appropriate here because it is defined in nemesis, not in CH3. The 'node_id' flag is defined in CH3. Note that for ch3:sock, even if origin and target are on the same node, they are not within the same SHM region. Currently ch3:sock is filtered out by checking the shm_allocated flag first. In the future we need to figure out a way to check if origin and target are within the same "SHM comm". Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
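The check described in this commit can be sketched roughly as below. The struct layouts and the function name `can_use_shm` are hypothetical simplifications; only the field names `node_id` and `shm_allocated` come from the commit message.

```c
#include <assert.h>

/* Hypothetical, stripped-down stand-ins for the virtual connection
 * and window objects. */
struct vc_sketch {
    int node_id;        /* defined at the CH3 level */
};

struct win_sketch {
    int shm_allocated;  /* nonzero only if a shared memory region exists */
};

/* Returns 1 if the shared memory path may be used, 0 otherwise. */
static int can_use_shm(const struct win_sketch *win,
                       const struct vc_sketch *origin,
                       const struct vc_sketch *target)
{
    /* Checked first: filters out channels such as ch3:sock, where the
     * processes may share a node but not a shared memory region. */
    if (!win->shm_allocated)
        return 0;

    return origin->node_id == target->node_id;
}
```

Checking `shm_allocated` before comparing node ids is what keeps a same-node pair without a shared region from taking the shared memory path.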
-
Because if MPI_WIN_FLAVOR_SHARED is used in ch3:sock, it will allocate normal memory instead of shared memory, therefore shm_base_addrs will not be used. Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
Because CH3 layer needs to know if shared memory region is allocated in lower layer. Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-