Commit b7538043 authored by David Goodell

[svn-r3794] Roll back r3749 in order to fix ticket #397. The intent of r3749
is still valid, but the reduce and allreduce implementations aren't currently
written correctly to deal with non-commutative operations this way.

No reviewer (old code).
parent 5f509ca2
@@ -71,8 +71,13 @@ MPIR_Op_check_dtype_fn *MPIR_Op_check_dtype_table[] = {
    Cost = (2.floor(lgp)+2).alpha + (2.((p-1)/p) + 2).n.beta + n.(1+(p-1)/p).gamma
-   For short messages and for count < pof2 we use a recursive doubling
-   algorithm (similar to the one in MPI_Allgather).
+   For short messages, for user-defined ops, and for count < pof2
+   we use a recursive doubling algorithm (similar to the one in
+   MPI_Allgather). We use this algorithm in the case of user-defined ops
+   because in this case derived datatypes are allowed, and the user
+   could pass basic datatypes on one process and derived on another as
+   long as the type maps are the same. Breaking up derived datatypes
+   to do the reduce-scatter is tricky.
    Cost = lgp.alpha + n.lgp.beta + n.lgp.gamma
@@ -255,10 +260,18 @@ int MPIR_Allreduce (
        else /* rank >= 2*rem */
            newrank = rank - rem;
-       /* If count is less than pof2, use recursive doubling algorithm.
-          Otherwise do a reduce-scatter followed by allgather. */
+       /* If op is user-defined or count is less than pof2, use
+          recursive doubling algorithm. Otherwise do a reduce-scatter
+          followed by allgather. (If op is user-defined,
+          derived datatypes are allowed and the user could pass basic
+          datatypes on one process and derived on another as long as
+          the type maps are the same. Breaking up derived
+          datatypes to do the reduce-scatter is tricky, therefore
+          using recursive doubling in that case.) */
        if (newrank != -1) {
            if ((count*type_size <= MPIR_ALLREDUCE_SHORT_MSG) ||
+               (HANDLE_GET_KIND(op) != HANDLE_KIND_BUILTIN) ||
                (count < pof2)) { /* use recursive doubling */
                mask = 0x1;
                while (mask < pof2) {
......
@@ -653,7 +653,7 @@ fn_fail:
    Algorithm: MPI_Reduce
-   For long messages and if count >= pof2 (where
+   For long messages and for builtin ops and if count >= pof2 (where
    pof2 is the nearest power-of-two less than or equal to the number
    of processes), we use Rabenseifner's algorithm (see
    http://www.hlrs.de/organization/par/services/models/mpi/myreduce.html ).
@@ -682,11 +682,22 @@ fn_fail:
    n.(1+(p-1)/p).gamma
-   For short messages or count < pof2, we use a binomial tree algorithm.
+   For short messages, user-defined ops, and count < pof2, we use a
+   binomial tree algorithm for both short and long messages.
    Cost = lgp.alpha + n.lgp.beta + n.lgp.gamma
+   We use the binomial tree algorithm in the case of user-defined ops
+   because in this case derived datatypes are allowed, and the user
+   could pass basic datatypes on one process and derived on another as
+   long as the type maps are the same. Breaking up derived datatypes
+   to do the reduce-scatter is tricky.
+   FIXME: Per the MPI-2.1 standard this case is not possible. We
+   should be able to use the reduce-scatter/gather approach as long as
+   count >= pof2. [goodell@ 2009-01-21]
    Possible improvements:
    End Algorithm: MPI_Reduce
@@ -744,7 +755,8 @@ int MPIR_Reduce (
    /* check if multiple threads are calling this collective function */
    MPIDU_ERR_CHECK_MULTIPLE_THREADS_ENTER( comm_ptr );
-   if ((count*type_size > MPIR_REDUCE_SHORT_MSG) && (count >= pof2)) {
+   if ((count*type_size > MPIR_REDUCE_SHORT_MSG) &&
+       (HANDLE_GET_KIND(op) == HANDLE_KIND_BUILTIN) && (count >= pof2)) {
        /* do a reduce-scatter followed by gather to root. */
        mpi_errno = MPIR_Reduce_redscat_gather(sendbuf, recvbuf, count, datatype, op, root, comm_ptr);
        if (mpi_errno) MPIU_ERR_POP(mpi_errno);
......