Commit 7c890ab2, committed by Pavan Balaji
Modify SHM ACC/GACC to avoid allocating a large buffer.
The original implementation of ACC/GACC on SHM first allocates a temporary buffer with the same data layout as the target data, copies the entire origin data into that temporary buffer, and then performs the ACC computation between the temporary buffer and the target buffer. The temporary buffer can consume a potentially large amount of memory. This patch fixes the issue as follows: (1) the SHM ACC/GACC routines call do_accumulate_op() directly, which requires the origin data to be packed; (2) if the origin data is of a basic type, do_accumulate_op() is performed directly between the origin buffer and the target buffer; if the origin data is of a derived type, the origin data is streamed: a portion of it is copied into a packed streaming buffer, and do_accumulate_op() is performed between the streaming buffer and the target buffer, one portion at a time.

Signed-off-by: Pavan Balaji <firstname.lastname@example.org>
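The streaming approach described in the message can be illustrated with a minimal sketch. This is not the code from the patch: accumulate_sum_packed() is a hypothetical stand-in for do_accumulate_op() (whose real signature is not shown here), the strided int layout stands in for a derived datatype, and STREAM_ELEMS is an arbitrary illustrative buffer size.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed size of the streaming buffer, in elements (illustrative only). */
#define STREAM_ELEMS 4

/* Hypothetical stand-in for do_accumulate_op(): element-wise sum of a
 * packed (contiguous) source into the target. */
static void accumulate_sum_packed(const int *packed, int *target, size_t count)
{
    for (size_t i = 0; i < count; i++)
        target[i] += packed[i];
}

/* Hypothetical helper: accumulate `count` ints that sit `stride` elements
 * apart in `origin` (a simple model of a derived type) into a contiguous
 * `target`, using a fixed-size streaming buffer instead of a temporary
 * buffer as large as the whole origin data. */
static void stream_accumulate_strided(const int *origin, size_t stride,
                                      int *target, size_t count)
{
    int stream_buf[STREAM_ELEMS];

    assert(stride >= 1);

    for (size_t done = 0; done < count; done += STREAM_ELEMS) {
        /* Number of elements handled in this round. */
        size_t n = count - done;
        if (n > STREAM_ELEMS)
            n = STREAM_ELEMS;

        /* Pack one portion of the origin data into the streaming buffer. */
        for (size_t i = 0; i < n; i++)
            stream_buf[i] = origin[(done + i) * stride];

        /* Accumulate the packed portion into the matching target slice. */
        accumulate_sum_packed(stream_buf, target + done, n);
    }
}
```

In this sketch the extra memory is bounded by STREAM_ELEMS regardless of count. For origin data of a basic type, which is already packed, accumulate_sum_packed() could be called on the origin buffer directly with no intermediate copy.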
Showing with 146 additions and 250 deletions