Rewrite all synchronization routines.
We use new algorithms for the RMA synchronization functions and RMA epochs. The old implementation used a lazy-issuing algorithm, which queues up all operations and issues them at the end of the epoch. This prevents us from taking advantage of hardware RMA operations and can exhaust memory resources when a large number of operations are queued. The new algorithm initiates the synchronization at the beginning of the epoch and issues operations as soon as the synchronization has finished.

Signed-off-by: Pavan Balaji <firstname.lastname@example.org>
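
The difference between the two strategies can be illustrated with a toy model. The sketch below is not the code from this commit; the names (toy_epoch, toy_post, toy_sync_complete, MAX_PENDING) are hypothetical, and "issuing" an operation is reduced to bumping a counter. It only shows the idea that operations are buffered while the synchronization is still in flight and are issued immediately afterwards.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 8   /* hypothetical bound on ops buffered before sync completes */

typedef struct {
    int    sync_done;             /* has the epoch's synchronization completed? */
    int    pending[MAX_PENDING];  /* ops posted while the sync is still in flight */
    size_t n_pending;
    size_t n_issued;              /* ops handed to the network so far */
} toy_epoch;

/* Start an epoch. In this model the synchronization is considered
 * initiated here, at the beginning, rather than at the end of the epoch. */
void toy_epoch_start(toy_epoch *ep)
{
    assert(ep != NULL);
    ep->sync_done = 0;
    ep->n_pending = 0;
    ep->n_issued  = 0;
}

/* Stand-in for actually issuing an RMA operation. */
void toy_issue(toy_epoch *ep, int op)
{
    (void) op;
    ep->n_issued++;
}

/* Post an operation: issue it right away if the synchronization has
 * completed, otherwise buffer it. Returns 0 on success, -1 if the
 * buffer is full. */
int toy_post(toy_epoch *ep, int op)
{
    if (ep->sync_done) {
        toy_issue(ep, op);
        return 0;
    }
    if (ep->n_pending == MAX_PENDING)
        return -1;
    ep->pending[ep->n_pending++] = op;
    return 0;
}

/* Called when the synchronization finishes: drain everything that was
 * buffered in the meantime, in posting order. */
void toy_sync_complete(toy_epoch *ep)
{
    ep->sync_done = 1;
    for (size_t i = 0; i < ep->n_pending; i++)
        toy_issue(ep, ep->pending[i]);
    ep->n_pending = 0;
}
```

In this model the buffer only has to hold operations posted during the synchronization window, whereas a lazy-issuing scheme would hold every operation of the epoch until its end.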
Showing with 862 additions and 1031 deletions