Commit 5e5a56f6 by David Goodell

### [svn-r3749] Use Rabenseifner's Algorithm for user defined reduction operators...

```
[svn-r3749] Use Rabenseifner's Algorithm for user defined reduction operators in MPI_Reduce and MPI_Allreduce.

Reviewed by thakur@.
```
parent c6cb095b
@@ -71,13 +71,8 @@ MPIR_Op_check_dtype_fn *MPIR_Op_check_dtype_table[] = {
    Cost = (2.floor(lgp)+2).alpha + (2.((p-1)/p) + 2).n.beta +
           n.(1+(p-1)/p).gamma

-   For short messages, for user-defined ops, and for count < pof2
-   we use a recursive doubling algorithm (similar to the one in
-   MPI_Allgather). We use this algorithm in the case of user-defined ops
-   because in this case derived datatypes are allowed, and the user
-   could pass basic datatypes on one process and derived on another as
-   long as the type maps are the same. Breaking up derived datatypes
-   to do the reduce-scatter is tricky.
+   For short messages and for count < pof2 we use a recursive doubling
+   algorithm (similar to the one in MPI_Allgather).

    Cost = lgp.alpha + n.lgp.beta + n.lgp.gamma

@@ -260,18 +255,10 @@ int MPIR_Allreduce (
         else  /* rank >= 2*rem */
             newrank = rank - rem;

-        /* If op is user-defined or count is less than pof2, use
-           recursive doubling algorithm. Otherwise do a reduce-scatter
-           followed by allgather. (If op is user-defined,
-           derived datatypes are allowed and the user could pass basic
-           datatypes on one process and derived on another as long as
-           the type maps are the same. Breaking up derived
-           datatypes to do the reduce-scatter is tricky, therefore
-           using recursive doubling in that case.) */
+        /* If count is less than pof2, use recursive doubling algorithm.
+           Otherwise do a reduce-scatter followed by allgather. */

         if (newrank != -1) {
             if ((count*type_size <= MPIR_ALLREDUCE_SHORT_MSG) ||
-                (HANDLE_GET_KIND(op) != HANDLE_KIND_BUILTIN) ||
                 (count < pof2)) { /* use recursive doubling */
                 mask = 0x1;
                 while (mask < pof2) {
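The two cost expressions quoted in the comment above can be compared numerically. The following is a minimal sketch, not MPICH code: the function names are hypothetical, `lgp` in the recursive doubling cost is assumed to mean floor(log2(p)), and the values passed for alpha (latency), beta (per-byte transfer time), and gamma (per-byte reduction time) are placeholders rather than measurements.

```c
#include <assert.h>

/* Hypothetical helper: floor(log2(p)) for p >= 1, using integer shifts. */
static int floor_lg(int p)
{
    int lg = 0;
    assert(p >= 1);
    while (p > 1) {
        p >>= 1;
        lg++;
    }
    return lg;
}

/* Reduce-scatter + allgather cost as written in the comment:
 * (2.floor(lgp)+2).alpha + (2.((p-1)/p) + 2).n.beta + n.(1+(p-1)/p).gamma */
static double cost_redscat_allgather(int p, double n,
                                     double alpha, double beta, double gam)
{
    double frac = (double)(p - 1) / (double)p;
    return (2.0 * floor_lg(p) + 2.0) * alpha
         + (2.0 * frac + 2.0) * n * beta
         + n * (1.0 + frac) * gam;
}

/* Recursive doubling cost as written in the comment:
 * lgp.alpha + n.lgp.beta + n.lgp.gamma
 * (assumption: lgp is taken as floor(log2(p)) here) */
static double cost_recursive_doubling(int p, double n,
                                      double alpha, double beta, double gam)
{
    double lgp = (double)floor_lg(p);
    return lgp * alpha + n * lgp * beta + n * lgp * gam;
}
```

With all three unit costs set to 1 and p = 8, the recursive doubling expression is smaller for n = 1 and the reduce-scatter expression is smaller for n = 1024, which is consistent with the comment's split between short and long messages.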
@@ -653,7 +653,7 @@ fn_fail:
    Algorithm: MPI_Reduce

-   For long messages and for builtin ops and if count >= pof2 (where
+   For long messages and if count >= pof2 (where
    pof2 is the nearest power-of-two less than or equal to the number
    of processes), we use Rabenseifner's algorithm (see
    http://www.hlrs.de/organization/par/services/models/mpi/myreduce.html ).

@@ -682,22 +682,11 @@ fn_fail:
           n.(1+(p-1)/p).gamma

-   For short messages, user-defined ops, and count < pof2, we use a
-   binomial tree algorithm for both short and long messages.
+   For short messages or count < pof2, we use a binomial tree algorithm.

    Cost = lgp.alpha + n.lgp.beta + n.lgp.gamma

-   We use the binomial tree algorithm in the case of user-defined ops
-   because in this case derived datatypes are allowed, and the user
-   could pass basic datatypes on one process and derived on another as
-   long as the type maps are the same. Breaking up derived datatypes
-   to do the reduce-scatter is tricky.
-
-   FIXME: Per the MPI-2.1 standard this case is not possible. We
-   should be able to use the reduce-scatter/gather approach as long as
-   count >= pof2. [goodell@ 2009-01-21]
-
    Possible improvements:

    End Algorithm: MPI_Reduce

@@ -755,8 +744,7 @@ int MPIR_Reduce (
     /* check if multiple threads are calling this collective function */
     MPIDU_ERR_CHECK_MULTIPLE_THREADS_ENTER( comm_ptr );

-    if ((count*type_size > MPIR_REDUCE_SHORT_MSG) &&
-        (HANDLE_GET_KIND(op) == HANDLE_KIND_BUILTIN) && (count >= pof2)) {
+    if ((count*type_size > MPIR_REDUCE_SHORT_MSG) && (count >= pof2)) {
         /* do a reduce-scatter followed by gather to root. */
         mpi_errno = MPIR_Reduce_redscat_gather(sendbuf, recvbuf, count, datatype, op, root, comm_ptr);
         if (mpi_errno) MPIU_ERR_POP(mpi_errno);
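The selection rule that remains in `MPIR_Reduce` after this change can be sketched in isolation. This is an illustrative standalone version, not the MPICH implementation: every `sketch_`/`SKETCH_` name is hypothetical, and the threshold value is an assumption because the real definition of `MPIR_REDUCE_SHORT_MSG` is not shown in this diff. The point it illustrates is that the operator kind no longer appears in the condition.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed threshold for illustration only; the real MPIR_REDUCE_SHORT_MSG
 * is defined elsewhere in MPICH and is not part of this diff. */
#define SKETCH_REDUCE_SHORT_MSG 2048

enum sketch_reduce_algo {
    SKETCH_BINOMIAL_TREE,   /* short messages or count < pof2 */
    SKETCH_REDSCAT_GATHER   /* reduce-scatter followed by gather to root */
};

/* Largest power of two less than or equal to comm_size (comm_size >= 1). */
static int sketch_pof2(int comm_size)
{
    int pof2 = 1;
    assert(comm_size >= 1);
    while (pof2 <= comm_size / 2)
        pof2 *= 2;
    return pof2;
}

/* Mirrors the post-commit condition:
 *   (count*type_size > MPIR_REDUCE_SHORT_MSG) && (count >= pof2)
 * Whether the op is builtin or user-defined is not consulted. */
static enum sketch_reduce_algo sketch_choose_reduce(int count, size_t type_size,
                                                    int comm_size)
{
    int pof2 = sketch_pof2(comm_size);
    assert(count >= 0);
    if (((size_t)count * type_size > SKETCH_REDUCE_SHORT_MSG) && (count >= pof2))
        return SKETCH_REDSCAT_GATHER;
    return SKETCH_BINOMIAL_TREE;
}
```

Under the assumed threshold, `sketch_choose_reduce(1024, 8, 6)` selects the reduce-scatter path, while a call with a count of 3 on 8 processes falls back to the binomial tree even when each element is large, because `count < pof2`.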