- 22 Jun, 2015 1 commit
-
-
Rob Latham authored
The ongoing march towards 64-bit clean continues. Address areas where a large product of two ints might have overflowed. Ref: #1767 Signed-off-by:
Pavan Balaji <balaji@anl.gov>
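A minimal sketch of the kind of overflow this commit addresses, assuming a byte count computed as the product of two ints. The helper name `buffer_bytes` is hypothetical and does not come from the MPICH source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bytes needed for `count` elements of `extent` bytes.
 * Casting one operand to a 64-bit type BEFORE multiplying makes the whole
 * multiplication 64-bit. Writing (int64_t)(count * extent) would be too
 * late: the int product would already have overflowed. */
static int64_t buffer_bytes(int count, int extent)
{
    assert(count >= 0 && extent >= 0);
    return (int64_t) count * extent;
}
```

With 32-bit ints, `buffer_bytes(1 << 20, 1 << 12)` yields 2^32, a value that a plain `int` product could not represent.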
-
- 26 Feb, 2015 1 commit
-
-
Signed-off-by:
Rob Latham <robl@mcs.anl.gov>
-
- 23 Feb, 2015 1 commit
-
-
Wesley Bland authored
When an operation is completed in an MPIC function, it should no longer have the error bits set since that information has now been captured in the errflag. This prevents unnecessary assert failures. Signed-off-by:
Huiwei Lu <huiweilu@mcs.anl.gov>
-
- 18 Feb, 2015 1 commit
-
-
Igor Ivanov authored
This issue was added in commit [54362c00]. Signed-off-by:
Igor Ivanov <Igor.Ivanov@itseez.com> Signed-off-by:
Wesley Bland <wbland@anl.gov>
-
- 13 Feb, 2015 1 commit
-
-
Wesley Bland authored
Requests that aren't for receive operations don't have anything in them, so trying to process them only generates valgrind warnings with potentially unsafe behavior. Signed-off-by:
Huiwei Lu <huiweilu@mcs.anl.gov>
-
- 31 Jan, 2015 1 commit
-
-
Wesley Bland authored
The previous commits didn't take empty requests into account when extracting the status. They also introduced a null pointer check bug that wasn't tested first. Signed-off-by:
Junchao Zhang <jczhang@mcs.anl.gov>
-
- 30 Jan, 2015 3 commits
-
-
Wesley Bland authored
Part of converting the NBC code to use the MPIC_* functions requires an MPIC_Issend function to exist. This adds it. Signed-off-by:
Huiwei Lu <huiweilu@mcs.anl.gov>
-
Wesley Bland authored
The MPIC helper functions have been using MPI_Comm and MPI_Request objects instead of their MPID_* counterparts. This leads to a bunch of unnecessary conversions back and forth between the two types of objects and makes the work incompatible with other parts of the codebase (non-blocking collectives for instance). This patch converts all of the MPIC_* functions to use MPID_Comm and MPID_Request and changes all of the collective calls to use them now too. Signed-off-by:
Huiwei Lu <huiweilu@mcs.anl.gov>
-
Wesley Bland authored
The collective helper functions generally have an errflag that is used when a failure is detected to allow the collective to continue while also communicating that a failure occurred. That flag is now included as a parameter for MPIC_Wait. The rest of this commit is the refactoring necessary in the rest of the helper functions to support the change. Signed-off-by:
Huiwei Lu <huiweilu@mcs.anl.gov>
-
- 27 Jan, 2015 1 commit
-
-
Kenneth Raffenetti authored
The tag for send was ignored and recvtag incorrectly used in its place. Signed-off-by:
Wesley Bland <wbland@anl.gov>
-
- 22 Jan, 2015 1 commit
-
-
Huiwei Lu authored
When a process fails, the fault tolerance scheme takes a different path from the existing one to deal with MPI object reference counts. Some reference counts were not properly set in the FT path, so when configured with --enable-g=all, some FT tests report leaked context IDs and dirty COMM, GROUP, and REQUEST objects on exit. This patch fixes ft/shrink and ft/agree with "--enable-g=all". Stack-allocated request, communicator, and group objects will be freed by FT. Signed-off-by:
Wesley Bland <wbland@anl.gov>
-
- 12 Nov, 2014 3 commits
-
-
Wesley Bland authored
The MPI collectives get and set the errflag used by the collective helper functions (MPIC_*). The possible values of the errflag changed, so the collective functions need to appropriately set this value using either MPIR_ERR_NONE (MPI_SUCCESS), MPIR_ERR_PROC_FAILED (MPIX_ERR_PROC_FAILED), or MPIR_ERR_OTHER (MPI_ERR_OTHER). This should allow collectives to correctly report process failures when they occur now, fixing the FT tests that use collectives (see #1945). Signed-off-by:
Huiwei Lu <huiweilu@mcs.anl.gov>
-
Wesley Bland authored
The errflag value being used in the MPIC helper functions only propagated whether or not an error occurred. It did not contain any information about what kind of error occurred, which made returning the correct error code after a process failure impossible. This patch converts the binary value to an enum with three options: MPIR_ERR_NONE, MPIR_ERR_PROC_FAILED, and MPIR_ERR_OTHER. The original FALSE and TRUE map to MPIR_ERR_NONE and MPIR_ERR_OTHER, respectively. MPIR_ERR_PROC_FAILED indicates that the error occurred because of a process failure. It uses the new bit set aside from the tag space to track such information between processes. This change required modifying many function signatures and type declarations to use the new enum type, but these changes are not very intrusive and shouldn't be a problem going forward. Signed-off-by:
Huiwei Lu <huiweilu@mcs.anl.gov>
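A rough sketch of the three-valued errflag described above. The enumerator names come from the commit message; the type name, the numeric values, and the `merge_errflag` helper are assumptions made for illustration, not the MPICH definitions.

```c
#include <assert.h>

/* Enumerator names are from the commit message; values are assumed. */
typedef enum {
    MPIR_ERR_NONE = 0,        /* no error seen (the old FALSE) */
    MPIR_ERR_PROC_FAILED = 1, /* error caused by a process failure */
    MPIR_ERR_OTHER = 2        /* any other error (the old TRUE) */
} sketch_errflag_t;

/* Hypothetical helper: fold a newly observed result into the running
 * flag. The first error recorded is kept; later results do not
 * overwrite it. */
static sketch_errflag_t merge_errflag(sketch_errflag_t current,
                                      sketch_errflag_t incoming)
{
    assert(incoming >= MPIR_ERR_NONE && incoming <= MPIR_ERR_OTHER);
    return (current != MPIR_ERR_NONE) ? current : incoming;
}
```

Keeping the first error is only one possible policy; the point is that the flag can now say which kind of error occurred, not just that one did.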
-
Wesley Bland authored
We need to take another bit from the tag space to specify the difference between a generic failure and a process failure. This patch modifies the macros to handle this situation. Signed-off-by:
Huiwei Lu <huiweilu@mcs.anl.gov>
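One way to picture reserving a second bit of the tag space is sketched below. The macro names, the bit positions, and the choice to set the generic error bit together with the process-failure bit are all assumptions; the actual MPICH macros may differ.

```c
#include <assert.h>

/* Assumed bit positions, chosen only for illustration. */
#define SKETCH_ERROR_BIT        (1 << 30) /* some error occurred */
#define SKETCH_PROC_FAILURE_BIT (1 << 29) /* the error was a process failure */
#define SKETCH_ERR_MASK         (SKETCH_ERROR_BIT | SKETCH_PROC_FAILURE_BIT)

/* Mark a tag as carrying a generic error. */
static int tag_mark_error(int tag)
{
    assert(tag >= 0);
    return tag | SKETCH_ERROR_BIT;
}

/* Mark a tag as carrying a process failure (also sets the error bit). */
static int tag_mark_proc_failure(int tag)
{
    assert(tag >= 0);
    return tag | SKETCH_ERR_MASK;
}

static int tag_has_error(int tag)        { return (tag & SKETCH_ERROR_BIT) != 0; }
static int tag_has_proc_failure(int tag) { return (tag & SKETCH_PROC_FAILURE_BIT) != 0; }

/* Recover the tag value with both reserved bits cleared. */
static int tag_strip_error_bits(int tag) { return tag & ~SKETCH_ERR_MASK; }
```

The receiver can then distinguish a generic failure from a process failure by inspecting the two bits before comparing the remaining tag value.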
-
- 23 Oct, 2014 2 commits
-
-
Wesley Bland authored
-
Wesley Bland authored
Back in the 3.1 series, we made the FT versions of all of the MPIC functions default. However, we never changed the names of all of the states. This removes the extra state names. No reviewer.
-
- 22 Oct, 2014 1 commit
-
-
Wesley Bland authored
The macro that called the bcast function left out an underscore in the mpi_errno return value. This caused the test to always return MPI_ERR_OTHER instead of the value being returned by the underlying bcast function. Signed-off-by:
Huiwei Lu <huiweilu@mcs.anl.gov>
-
- 21 Oct, 2014 1 commit
-
-
Wesley Bland authored
The previous commit only adjusted the tag checking procedure for MPIC_Recv, but it should be the same for MPIC_Sendrecv and MPIC_Sendrecv_replace. Signed-off-by:
Huiwei Lu <huiweilu@mcs.anl.gov>
-
- 20 Oct, 2014 1 commit
-
-
Wesley Bland authored
Tags were sometimes being tested when the communication call had already failed, leading to bad asserts later due to uninitialized values. Sometimes, these results would have ended up with an error anyway, but sometimes not. Signed-off-by:
Huiwei Lu <huiweilu@mcs.anl.gov>
-
- 14 Oct, 2014 1 commit
-
-
Wesley Bland authored
This CVAR was deprecated in MPICH 3.1 and will now be removed for MPICH 3.2. Signed-off-by:
Junchao Zhang <jczhang@mcs.anl.gov>
-
- 10 Oct, 2014 1 commit
-
-
Fixed wrong parameter check condition for MPI_Iallgather and MPI_Iallgatherv. -1 is a valid value for sendcount when MPI_IN_PLACE is used. The MPI spec says: "The in place option for intracommunicators is specified by passing the value MPI_IN_PLACE to the argument sendbuf at all processes. In such a case, sendcount and sendtype are ignored, and the input data of each process is assumed to be in the area where that process would receive its own contribution to the receive buffer." Signed-off-by:
Igor Ivanov <Igor.Ivanov@itseez.com> Signed-off-by:
Pavan Balaji <balaji@anl.gov>
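The corrected check can be pictured roughly as follows. To stay compilable without mpi.h, the sketch uses a stand-in sentinel instead of the real MPI_IN_PLACE, and the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for MPI_IN_PLACE so the sketch builds without mpi.h. */
static char sketch_in_place_marker;
#define SKETCH_IN_PLACE ((const void *) &sketch_in_place_marker)

/* Hypothetical validation step: returns 1 if the send arguments are
 * acceptable, 0 otherwise. */
static int sendcount_is_valid(const void *sendbuf, int sendcount)
{
    if (sendbuf == SKETCH_IN_PLACE)
        return 1;          /* sendcount and sendtype are ignored */
    return sendcount >= 0; /* otherwise a negative count is an error */
}
```

The key point is the order of the tests: the in-place case is recognized first, so a sendcount of -1 is never rejected when it is meant to be ignored.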
-
- 01 Oct, 2014 1 commit
-
-
Junchao Zhang authored
The warning message is: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement] Fixes #2167 Signed-off-by:
Antonio J. Pena <apenya@mcs.anl.gov>
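For context, -Wdeclaration-after-statement fires when a declaration follows a statement inside a block. A small sketch of the C90-friendly layout, using a made-up function:

```c
#include <assert.h>

/* Sum of 1..n, laid out so that it also compiles as C90: every
 * declaration sits at the top of the block, before any statement.
 * Declaring `total` after the assert() below would trigger the
 * warning quoted in the commit message. */
static int sum_to(int n)
{
    int i;
    int total = 0;

    assert(n >= 0);
    for (i = 1; i <= n; i++)
        total += i;
    return total;
}
```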
-
- 31 Jul, 2014 3 commits
-
-
Wesley Bland authored
If a process is dead, collectives still do all of the communications to prevent a deadlock. However, if we just skip the part where the data is updated in the allreduce_group function, we can let it be slightly more resilient to failures and possibly even produce a correct answer in the presence of a failure. Signed-off-by:
Junchao Zhang <jczhang@mcs.anl.gov>
-
Wesley Bland authored
Signed-off-by:
Junchao Zhang <jczhang@mcs.anl.gov>
-
Wesley Bland authored
Adds a second parameter to MPID_Comm_valid_ptr that controls whether the macro ignores the revoke flag. Signed-off-by:
Junchao Zhang <jczhang@mcs.anl.gov>
-
- 19 Jul, 2014 1 commit
-
-
Su Huang authored
Added the check on count and datatype to the following functions: MPI_Iallreduce, MPI_Ialltoall, MPI_Ibcast, MPI_Igather, MPI_Ireduce, MPI_Iscan. (ibm) D198793 Fixes #2139 Signed-off-by:
Michael Blocksome <blocksom@us.ibm.com> Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
- 07 Jul, 2014 3 commits
-
-
Moved the weak,alias attribute declarations from header files to the implementation. Complies with the requirement that alias targets are defined in the same compilation unit. Fixes #2002 Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
Su Huang authored
(ibm) D198243 Signed-off-by:
Michael Blocksome <blocksom@us.ibm.com> Signed-off-by:
Wesley Bland <wbland@anl.gov>
-
Wesley Bland authored
If the size of the input buffers is 0, don't bother to check if they are aliased. Fixes #2124 Signed-off-by:
Antonio J. Pena <apenya@mcs.anl.gov>
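The idea can be sketched as an early return in the aliasing check. The function below is hypothetical and compares raw pointers only; it is not the MPICH macro.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical check: returns 1 when the two buffers should be reported
 * as aliased, 0 otherwise. */
static int buffers_aliased(const void *sendbuf, const void *recvbuf,
                           size_t nbytes)
{
    if (nbytes == 0)
        return 0; /* nothing is read or written, so skip the comparison */
    return sendbuf == recvbuf;
}
```

With a zero-size transfer, identical pointers are harmless, so reporting an aliasing error would only produce a false positive.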
-
- 26 Jun, 2014 1 commit
-
-
The patch provides a better implementation where each group does an intra-reduce concurrently, exchanges the reduce result, and broadcasts in the group. The current implementation had the problem of serializing the intra-reduces of each group. Fixes ticket #2074. Signed-off-by:
Sangmin Seo <sseo@anl.gov>
-
- 10 Jun, 2014 1 commit
-
-
Wesley Bland authored
If the user isn't using MPI_IN_PLACE when they should, this check will do a better job of warning them about it. See #2049 Signed-off-by:
Antonio J. Pena <apenya@mcs.anl.gov>
-
- 18 Apr, 2014 1 commit
-
-
Wesley Bland authored
In the MPIC_Sendrecv functions, the status object should always be defined since we use it internally. This won't have any impact on performance since the default is always to have FT collectives turned on anyway, but it will prevent a crash when someone overwrites that default. Fixes #2026 Signed-off-by:
Sangmin Seo <sseo@anl.gov>
-
- 11 Apr, 2014 1 commit
-
-
Antonio J. Pena authored
Fixes the following warning when --enable-fast is enabled: src/mpi/coll/bcast.c:1012:23: warning: suggest braces around empty body in an 'if' statement [-Wempty-body] See #1966. Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
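A plausible way this warning arises is an `if` whose only statement is a macro that expands to nothing in an optimized build. The sketch below assumes that cause; the macro and function names are made up, and the commit message does not say which fix was actually applied.

```c
#include <assert.h>

/* Hypothetical debug macro that compiles away, as checks often do in
 * optimized builds. */
#define SKETCH_DEBUG_CHECK(cond) /* expands to nothing */

static int clamp_nonnegative(int value)
{
    /* Without the braces this would read `if (value < 0) ;`, which is
     * the empty body that -Wempty-body flags. */
    if (value < 0) {
        SKETCH_DEBUG_CHECK(value >= -100);
    }
    return (value < 0) ? 0 : value;
}
```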
-
- 01 Apr, 2014 1 commit
-
-
Maintain a list of files that go into each library. If a particular binding is not enabled, the list variable still exists, but will just be empty. This simplifies the management of which files/symbols go into which library.

Move all MPI_ symbols to the libmpi library and all other symbols to the libpmpi library. All Fortran 77 symbols go into libmpif77.so, while C symbols go into libmpi.so. There are some exceptions, such as status_f2c, which are handled by the Fortran code but used in C. Our Fortran 90 build only creates a few symbols and uses the f77 symbols for everything else. These few symbols go into libmpifort.so.

Also update compiler wrappers to link to the correct libraries. mpif77 should now link with libmpif77. mpif90 links with both libmpifort and libmpif77, since our F90 build still keeps the core Fortran library symbols in libf77. We completely ignored the F77 library earlier. This was OK because all of the Fortran symbols were ending up in libmpi. Now that we have separated out the symbols to the right library, we need to link to libmpif77 as well.

Also added inter-library dependencies. libmpi has a dependency on several internal libraries: libmpl, libopa. libmpicxx did not have a dependency on libmpi, added. libmpif77 did not have a dependency on libmpi, added. libmpifort did not have a dependency on libmpi, added.

This dependency model is sufficient for C and F77, but not for C++ and F90. The C and F77 libraries contain all the symbols the application relies on, but the F90 and C++ libraries don't. In the case of F90, symbols such as mpi_bcast are missing and are borrowed from the F77 library. In the case of C++, mpicxx.h contains calls directly to C functions (such as MPI_Reduce_local), which get embedded into the application. Fixes #2023. Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
- 26 Feb, 2014 1 commit
-
-
Pavan Balaji authored
Simply ran the new ./maint/check_copyright.bash script. Fixes #2032. Signed-off-by:
Ken Raffenetti <raffenet@mcs.anl.gov>
-
- 07 Jan, 2014 2 commits
-
-
Pavan Balaji authored
The current implementation of SMP-awareness in MPI_Barrier was in MPIR_Barrier_impl. This makes the default implementation of barrier SMP-aware. However, if a device overrides barrier and then calls back the default implementation through MPIR_Barrier, it can no longer take advantage of SMP-awareness. This patch moves the SMP-aware implementation to MPIR_Barrier_intra, which is called by MPIR_Barrier_impl through MPIR_Barrier, in the default implementation. See #1957. Signed-off-by:
Michael Blocksome <blocksom@us.ibm.com>
-
Pavan Balaji authored
Use MPIR_Bcast as an inline function, and let the compiler do its job, instead of duplicating the code. Signed-off-by:
Michael Blocksome <blocksom@us.ibm.com>
-
- 31 Dec, 2013 3 commits
-
-
Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-
Signed-off-by:
Pavan Balaji <balaji@mcs.anl.gov>
-