- 29 Jan, 2014 7 commits
-
-
When qsort is not available, don't define the comparison function and fall back to a simple insertion sort implementation. In the future, a more general function with a fallback should be added to MPL so it can be used in other cases like comm_split. Refs #2007 Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
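A minimal sketch of the fallback pattern described above, not the actual MPICH code. The macro name `HAVE_QSORT` and the helper names `cmp_int` and `sort_ints` are assumptions made for illustration; the idea is only that the comparison function exists when `qsort` does, and an insertion sort takes over otherwise.

```c
#include <assert.h>
#include <stddef.h>

#ifdef HAVE_QSORT               /* assumed configure-style macro name */
#include <stdlib.h>

/* The comparison function is only defined when qsort() will use it. */
static int cmp_int(const void *x, const void *y)
{
    int a = *(const int *) x;
    int b = *(const int *) y;
    return (a > b) - (a < b);
}
#endif

/* Hypothetical helper: sort an int array in ascending order. */
static void sort_ints(int *a, size_t n)
{
    assert(a != NULL || n == 0);
#ifdef HAVE_QSORT
    qsort(a, n, sizeof(int), cmp_int);
#else
    /* Fallback: simple insertion sort, O(n^2) but dependency-free. */
    for (size_t i = 1; i < n; i++) {
        int key = a[i];
        size_t j = i;
        /* Shift larger elements one slot right to open a gap for key. */
        while (j > 0 && a[j - 1] > key) {
            a[j] = a[j - 1];
            j--;
        }
        a[j] = key;
    }
#endif
}
```

Both paths give the same ordering, so callers do not need to know which one was compiled in.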
-
Pavan Balaji authored
The original PMI process mapping parsing code had a number of assumptions that would allow it to only work on COMM_WORLD. This patch corrects these to work for dynamic processes as well. It also corrects the evaluation of the number of nodes used to be correct in the general case. Refs #2007. Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
Pavan Balaji authored
Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
Pavan Balaji authored
This reverts commit 058c8bf0. Refs #1996.
-
Some netmods can't handle FT right now. To allow the test suite to work properly on those netmods, this adds an option for those tests to be disabled at configure time using the flag --disable-ft-tests. Fixes #2005 Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
Pavan Balaji authored
During finalize, we were destroying the COMM_WORLD, COMM_SELF and COMM_IWORLD communicator objects, and all other associated resources internally, before the final progress checks for incoming messages had finished. This resulted in the following sequence of cleanup:
1. COMM_WORLD got cleaned up. Internally, there is a check to see if a group object has been allocated for COMM_WORLD. If there is one, it is freed up.
2. We waited for other messages to arrive. We noticed a failure at this time, so we tried to create a failed process group. This uses the COMM_WORLD group internally, causing it to be created again, but with a reference count of 2, since the code assumes that the first reference count is always for the original COMM_WORLD.
3. When we try to free the world group, we notice that the reference count is 2, so we decrement the reference count and do not actually free the object.
Moving the check for incoming messages to happen before the communicator free fixes this problem. See #1996 Signed-off-by:
Wesley Bland <wbland@mcs.anl.gov>
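The ordering problem can be illustrated with a toy reference-counting model. This is only a sketch under stated assumptions, not the MPICH finalize code: every `toy_` name is hypothetical, and the model keeps just the one rule from the message above, that a freshly created world group starts with a count of 2 because one reference is assumed to belong to COMM_WORLD.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int ref_count; } toy_group;

static toy_group *world_group;  /* lazily created group of COMM_WORLD */

/* Return the world group, creating it on first use. A new group starts
 * at 2: one reference assumed to be COMM_WORLD's, one for the caller. */
static toy_group *toy_get_world_group(void)
{
    if (world_group == NULL) {
        world_group = malloc(sizeof *world_group);
        assert(world_group != NULL);
        world_group->ref_count = 2;
    } else {
        world_group->ref_count++;
    }
    return world_group;
}

/* Drop one reference; free the object when the count reaches zero. */
static void toy_release(toy_group *g)
{
    if (--g->ref_count == 0) {
        free(g);
        world_group = NULL;
    }
}

/* COMM_WORLD teardown: drop COMM_WORLD's reference if a group exists. */
static void toy_free_comm_world(void)
{
    if (world_group != NULL)
        toy_release(world_group);
}

/* Final progress check that sees a failure and briefly uses the group. */
static void toy_final_progress_check(void)
{
    toy_group *g = toy_get_world_group();
    /* ... a failed-process group would be derived from g here ... */
    toy_release(g);
}

/* Run finalize in either order; return 1 if the world group leaked. */
static int toy_finalize(int check_messages_first)
{
    world_group = NULL;
    if (check_messages_first) {
        toy_final_progress_check();
        toy_free_comm_world();
    } else {
        toy_free_comm_world();
        toy_final_progress_check();
    }
    int leaked = (world_group != NULL);
    free(world_group);          /* tidy up so the toy itself does not leak */
    world_group = NULL;
    return leaked;
}
```

In this model `toy_finalize(0)` (communicators freed first) leaves one reference outstanding, while `toy_finalize(1)` (progress check first) releases the group completely.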
-
Kenneth Raffenetti authored
No reviewer.
-
- 27 Jan, 2014 4 commits
-
-
Kenneth Raffenetti authored
Add a number of synthetic topologies and reference binding bitmaps to the implementation specific tests directory. Running the proc_binding.sh script will show any errors in hydra's binding logic. Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
Kenneth Raffenetti authored
Output processor affinity info as a single string. This should prevent mangled lines in the debugging output in most cases. Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
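One way to get single-string output is to format the whole line into a buffer and emit it with one write, so that concurrent processes cannot interleave partial lines. The sketch below assumes a hypothetical helper `format_binding` and an invented line format; neither is taken from hydra.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build "process R binding: c0 c1 ..." in buf.
 * Returns the string length, or -1 if buf is too small. */
static int format_binding(char *buf, size_t len, int rank,
                          const int *cpus, int ncpus)
{
    assert(buf != NULL && len > 0);
    int off = snprintf(buf, len, "process %d binding:", rank);
    if (off < 0 || (size_t) off >= len)
        return -1;
    for (int i = 0; i < ncpus; i++) {
        size_t room = len - (size_t) off;
        int n = snprintf(buf + off, room, " %d", cpus[i]);
        if (n < 0 || (size_t) n >= room)
            return -1;
        off += n;
    }
    return off;
}
```

A caller would then print the finished buffer with a single call such as `puts(buf)` instead of one `printf` per CPU.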
-
Wesley Bland authored
No reviewer
-
Resets MPIDI_TAG_UB back to 0x7fffffff. This value was changed a while back, but the change should have happened at the MPI layer instead of the CH3 layer. This resets the value to allow CH3 to use the tag space. Instead, the value is now set in the MPI layer during initthread. This means that it will be safe regardless of the device being used. This prevents a collision that was occurring on the pamid device where the values for MPIR_TAG_ERROR_BIT and the MPIR_Process.attr.tagged_coll_mask values were the same. Fixes #2008 Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
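The kind of collision described above can be sketched with plain bit masks. Only the value 0x7fffffff comes from the commit message; the bit positions chosen for `TOY_ERROR_BIT` and `TOY_COLL_MASK` and all function names are hypothetical, picked purely to show how a layer above the device can cap the user-visible tag bound below its reserved bits.

```c
#include <assert.h>

#define TOY_DEVICE_TAG_UB 0x7fffffffu   /* value named in the commit */
#define TOY_ERROR_BIT     (1u << 30)    /* hypothetical bit position */
#define TOY_COLL_MASK     (1u << 29)    /* hypothetical bit position */

/* Two reserved masks collide when they share at least one bit. */
static int toy_masks_collide(unsigned a, unsigned b)
{
    return (a & b) != 0;
}

/* Largest user tag that stays below every reserved bit, never above
 * what the device itself supports. */
static unsigned toy_user_tag_ub(unsigned device_ub, unsigned reserved)
{
    assert(reserved != 0);
    unsigned lowest = reserved & (~reserved + 1u);  /* lowest set bit */
    unsigned ub = lowest - 1u;
    return ub < device_ub ? ub : device_ub;
}
```

With these example positions the two masks are disjoint, and the user-visible bound works out to 0x1fffffff regardless of which device supplies the 0x7fffffff limit.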
-
- 26 Jan, 2014 6 commits
-
-
Users can specify the envvar HYDRA_TOPO_DEBUG such that hydra will print out the cpu bindings for an MPI job, then run without actually binding the processes. This is useful for debugging with hwloc's arbitrary topology loading functionality. Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
Hydra does not currently support user-defined process mapping strings. Remove the help text for now. Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
Pavan Balaji authored
Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
Significant rework of the process binding/mapping features in hydra. There were a number of bugs in the existing code. This commit addresses them and simplifies the binding/mapping logic. It also makes binding/mapping options more permissive. If a user specifies a system element that does not exist in the process affinity options, use the next largest element in the topology. This makes things safer for systems in which hwloc does not report certain elements, e.g., a Haswell-based MBP that shows no sockets, but does have a NUMA node. Other comments:
1. User-defined mapping strings (e.g. TCSNB) are no longer supported. Support may be added back at a later time, depending on user feedback.
2. Properly support cache-level binding/mapping. To accommodate all levels of processor cache, we define objects by their absolute depth in the topology.
3. Allocate the correct number of binding/mapping combinations given the user-provided options, and populate them accordingly.
Refs #1858 Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
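The "next largest element" rule can be sketched without hwloc as a search over an ordered list of levels. The enum, its ordering (socket below NUMA node below machine), and the function name are assumptions for illustration only; hydra works on hwloc topology depths rather than a fixed enum.

```c
#include <assert.h>

/* Hypothetical topology levels, ordered from smallest to largest. */
enum toy_level {
    TOY_CORE,
    TOY_CACHE,
    TOY_SOCKET,
    TOY_NUMA,
    TOY_MACHINE,
    TOY_NLEVELS
};

/* Return the requested level if the topology reports it, otherwise the
 * next larger level that is present, or -1 if none is. */
static int toy_resolve_level(const int present[TOY_NLEVELS], int requested)
{
    assert(requested >= 0 && requested < TOY_NLEVELS);
    for (int l = requested; l < TOY_NLEVELS; l++) {
        if (present[l])
            return l;
    }
    return -1;
}
```

On a machine that reports no sockets but does report a NUMA node, a request for socket binding resolves to the NUMA level under this ordering.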
-
Add a program in the hydra examples directory for use with the hydra binding/mapping options. This program will print out which CPUs it is allowed to run on according to the OS. Refs #1858 Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
Move the bitmap allocation out of init so we can correctly allocate what is needed for a user binding. Remove unnecessary duplicate code and use simpler hwloc provided functions where possible. Refs #1858 Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
- 24 Jan, 2014 1 commit
-
-
Fixes errors in commit [5bbfe808]. 1. Run config.rpath with each compiler individually, as the syntax may differ for passing options through to the linker. 2. Move the rpath flags to just before the mpich library, where they are necessary. Fixes #1044. Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
- 21 Jan, 2014 3 commits
-
-
Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
Pavan Balaji authored
Signed-off-by:
Michael Blocksome <blocksom@us.ibm.com>
-
Kenneth Raffenetti authored
Signed-off-by:
Junchao Zhang <jczhang@mcs.anl.gov>
-
- 20 Jan, 2014 1 commit
-
-
Rob Latham authored
This reverts commit 38ef5818. The MPICH-1 and Intel tests found unexpected results with these optimizations. Will explore later. Conflicts: src/mpid/common/datatype/dataloop/dataloop_optimize.c src/mpid/common/datatype/mpid_type_debug.c
-
- 19 Jan, 2014 4 commits
-
-
Fix warnings and errors in the IB netmod when configured with --enable-strict and --enable-g=all. Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
Pavan Balaji authored
No interfaces were added or removed. There were just source code changes. Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
Pavan Balaji authored
No Reviewer.
-
- 18 Jan, 2014 4 commits
-
-
Pavan Balaji authored
On some machines the iterations take unusually long. If they are getting to be larger than a predefined amount, break out of that loop. Fixes #1669. Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
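A bounded test loop of this kind might look like the sketch below. It assumes the "predefined amount" is a limit on total elapsed time, which the commit message does not state explicitly. All names are hypothetical, and the clock is passed in as a function pointer so the example can be driven by a fake clock; a real test would more likely call `MPI_Wtime`.

```c
#include <assert.h>
#include <stddef.h>

typedef double (*toy_clock_fn)(void);

/* Run up to max_iters iterations of work(), but stop early once the
 * elapsed time exceeds limit_sec. Returns the iterations completed. */
static int toy_run_bounded(int max_iters, double limit_sec,
                           toy_clock_fn now, void (*work)(void))
{
    assert(now != NULL && work != NULL);
    double start = now();
    int done = 0;
    for (int i = 0; i < max_iters; i++) {
        work();
        done++;
        if (now() - start > limit_sec)
            break;              /* iterations are too slow on this machine */
    }
    return done;
}

/* Fake clock and workload for demonstration: each call to the workload
 * advances simulated time by 2 seconds. */
static double toy_fake_time;
static double toy_fake_clock(void) { return toy_fake_time; }
static void toy_slow_work(void) { toy_fake_time += 2.0; }
```

With the fake 2-second workload and a 5-second limit the loop stops after the third iteration, instead of running all requested iterations.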
-
Pavan Balaji authored
The code was too hard to parse to make any changes.
-
Simplify logic in compile wrapper scripts. Use configure substitutions where possible to better match pkg-config style. Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov> Includes the following modifications by Pavan Balaji: Remove the PAC_COMPILER_SHLIB_FLAGS usage, instead of modifying the macro in confdb. The ordering of flags in mpicc and friends does not match that of pkg-config. This is because of two reasons. 1. pkg-config reorders flags when it outputs them. This requires us to manually adjust the flags in mpicc to match up, and is error prone. 2. mpicc and friends provide LDFLAGS before the user-specified flags, followed by the include and library directories. This is to make sure that the LDFLAGS are listed before the application source file. Reordering them to match pkg-config loses this flexibility. Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
Add rpath flags to pkg-config to match compiler wrappers. Fixes #1044 Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
- 16 Jan, 2014 3 commits
-
-
Rob Latham authored
Some datatype performance tests in the MPICH test suite fail: (perf/twovec, perf/nestvec, perf/nestvec2, perf/indexperf, perf/transp-datatype). This changeset introduces a few optimizations that operate on the dataloop representation to make it more performant. perf/indexperf should still fail under these changes. Original-author: Bill Gropp <wgropp@illinois.edu> See #1788, for which this resolves some but not all performance issues. Signed-off-by:
Rob Latham <robl@mcs.anl.gov>
-
William Gropp authored
The test in test/mpi/perf/twovec made invalid assumptions about the performance of two MPI datatype creation routines. This is a hard test to get right, but this version is more likely to avoid falsely signalling an error.
-
Pavan Balaji authored
This was meant to test out the case when MPI_Test is not nonblocking. However, we ended up assuming that MPI_Win_lock will be nonblocking. That is not specified by the standard and might not be true. Commenting this out until we find a better way to test the original problem with MPI_Test. Fixes #1910. Signed-off-by:
Rajeev Thakur <thakur@mcs.anl.gov>
-
- 15 Jan, 2014 5 commits
-
-
William Gropp authored
Fix a transposition in use of fprintf and remove benign trailing blanks
-
William Gropp authored
Added mpivars to the programs known to Automake so that it will build and install it.
-
William Gropp authored
If we need to build a separate profiling library, make sure we add that to the link line.
-
William Gropp authored
This adds a program that should work with any MPI 3 implementation to show the control and performance variables defined at MPI_Init time.
-
William Gropp authored
The original code here was a bit subtle, so this fix adds a comment explaining the particular test and provides the fallback message when a datatype has no name associated with it.
-
- 13 Jan, 2014 2 commits
-
-
Junchao Zhang authored
When mpich is configured with ./configure --enable-coverage ..., we add -fprofile-arcs -ftest-coverage to CXXFLAGS. According to the gcc manual, -fprofile-arcs -ftest-coverage can be replaced by --coverage. Unfortunately, I found that although both add -lgcov when linking libmpichcxx.so, -lgcov is put at the end of the link command, which is too late. Adding -lgcov directly to LIBS puts -lgcov before -lc, which lets the linker correctly resolve the symbol 'atexit' when linking. Fixes #2000 Signed-off-by:
Huiwei Lu <huiweilu@mcs.anl.gov>
-
Junchao Zhang authored
Signed-off-by:
Huiwei Lu <huiweilu@mcs.anl.gov>
-