- 20 Apr, 2015 1 commit
-
-
Xin Zhao authored
'stream_offset' specifies the starting position (on the target window) of the current streaming unit in ACC-like operations. It was originally stored in the RMA packet struct, which potentially increases the CH3 packet size. In this patch, we move 'stream_offset' out of the RMA packet as follows: 1. when the target data is a basic datatype, we use 'stream_offset' and the starting address of the entire operation to calculate the starting address of the current streaming unit, and rewrite 'addr' in the RMA packet with that value; 2. when the target data is a derived datatype, we cannot do the same because the target needs to know both the starting address of the entire operation and the starting address of the current streaming unit. Therefore, we send 'stream_offset' separately to the target side. Signed-off-by:
Min Si <msi@il.is.s.u-tokyo.ac.jp> Signed-off-by:
Antonio J. Pena <apenya@mcs.anl.gov>
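The basic-datatype case above can be sketched as follows. This is only an illustration: the struct and function names are hypothetical, and it assumes 'stream_offset' is a byte offset from the start of the whole operation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an ACC packet header. */
typedef struct {
    void *addr;   /* target address carried in the packet */
    int count;    /* number of basic elements in this streaming unit */
} acc_pkt_sketch_t;

/* Basic-datatype case: fold stream_offset (assumed to be in bytes) into
 * 'addr', so the packet needs no separate stream_offset field. */
static void set_stream_unit_addr_sketch(acc_pkt_sketch_t *pkt, void *op_base,
                                        size_t stream_offset)
{
    assert(pkt != NULL && op_base != NULL);
    pkt->addr = (char *) op_base + stream_offset;
}
```

For derived datatypes this rewrite is not enough, since the target needs both addresses, which is why the commit sends 'stream_offset' separately in that case.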
-
- 09 Apr, 2015 1 commit
-
-
Antonio Pena Monferrer authored
The datatype size was checked outside the appropriate branches in a couple of places. Signed-off-by:
Xin Zhao <xinzhao3@illinois.edu>
-
- 04 Mar, 2015 8 commits
-
-
In the MPI standard, a predefined datatype is called a basic type. It is better for the names in the code to match the standard. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
On the target side, we always allocate an SRBuf of 256K, which equals the size of a stream unit, to receive ACC/GACC data. Note that in MPIDI_CH3U_Request_load_recv_iov(), for ACC/GACC operations, since we already use an SRBuf to receive the data at the beginning, we do not use another SRBuf here, which avoids one more memory copy. Also, we pass the stream_offset in the current RMA packet to the request struct (when receiving is not finished) and to the do_accumulate_op function (when receiving is finished). Signed-off-by:
Pavan Balaji <balaji@anl.gov>
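As a rough illustration of streaming in 256K units, the sketch below splits a transfer of a given byte length into units that each fit one SRBuf. The names are hypothetical, and the sketch ignores any alignment of units to element boundaries that a real implementation would need.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed stream unit size, taken from the 256K SRBuf size in the commit. */
#define STREAM_UNIT_BYTES_SKETCH ((size_t) 256 * 1024)

/* Number of stream units needed for total_bytes of data. */
static size_t num_stream_units_sketch(size_t total_bytes)
{
    return (total_bytes + STREAM_UNIT_BYTES_SKETCH - 1) / STREAM_UNIT_BYTES_SKETCH;
}

/* Byte offset of unit unit_idx within the whole operation. */
static size_t stream_unit_offset_sketch(size_t unit_idx)
{
    return unit_idx * STREAM_UNIT_BYTES_SKETCH;
}

/* Length of unit unit_idx; only the last unit may be shorter than 256K. */
static size_t stream_unit_len_sketch(size_t total_bytes, size_t unit_idx)
{
    size_t offset = stream_unit_offset_sketch(unit_idx);
    assert(offset < total_bytes);
    size_t rest = total_bytes - offset;
    return rest < STREAM_UNIT_BYTES_SKETCH ? rest : STREAM_UNIT_BYTES_SKETCH;
}
```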
-
This patch adds req types for FOP operation, and calls FOP req handler after SRBuf is unpacked. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Add stream_offset area into ACC-related packets and request struct to remember current stream unit's starting position in the entire target data. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
The original implementation of RMA does not consider pair basic types (e.g. MPI_FLOAT_INT, MPI_DOUBLE_INT). It only works correctly with builtin datatypes (e.g. MPI_INT, MPI_FLOAT). This patch makes RMA work correctly with pair basic types. The bug has two parts: (1) when performing the ACC computation, the original implementation uses 'eltype' in the datatype structure, which is set when all basic elements in the datatype have the same builtin datatype. When basic elements have different builtin datatypes, as in pair datatypes, 'eltype' is set to MPI_DATATYPE_NULL, so the ACC computation cannot work with pair types; (2) for all basic-type data, the original implementation assumes that the data is contiguous and issues it in an unpacked manner with a length equal to the data size (count*type_size). This is incorrect for pair datatypes, because most pair datatypes are non-contiguous (type_extent != type_size). In the previous patch, we already made 'eltype' store the basic type instead of the builtin type. In this patch, we fix the bug by (1) modifying the ACC computation to treat 'eltype' as a basic type; (2) using the noncontig API for non-contiguous basic-type data so that it is issued in a packed manner. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
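The size/extent mismatch can be seen with an ordinary C struct. The sketch below uses a hypothetical struct that mirrors the layout one would expect for a double/int pair and packs an array of such pairs into a contiguous buffer. It is not the MPICH noncontig API, only an illustration of why count*type_size bytes cannot be sent straight from the user buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical struct mirroring a double/int pair type. */
struct double_int_sketch {
    double value;
    int index;
};

/* "Size": bytes of actual data in one pair. */
static size_t pair_type_size_sketch(void)
{
    return sizeof(double) + sizeof(int);
}

/* "Extent": stride between consecutive pairs, including any padding. */
static size_t pair_type_extent_sketch(void)
{
    return sizeof(struct double_int_sketch);
}

/* Pack count pairs into dst without padding; returns bytes written. */
static size_t pack_pairs_sketch(const struct double_int_sketch *src,
                                size_t count, char *dst)
{
    size_t pos = 0;
    for (size_t i = 0; i < count; i++) {
        memcpy(dst + pos, &src[i].value, sizeof(double));
        pos += sizeof(double);
        memcpy(dst + pos, &src[i].index, sizeof(int));
        pos += sizeof(int);
    }
    return pos;
}
```

On platforms where the struct is padded, the extent exceeds the size, so a plain copy of count*type_size bytes would pick up padding and drop trailing data.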
-
Xin Zhao authored
Flag MPIDI_CH3_PKT_FLAG_RMA_IMMED_RESP tells the target whether the response packet of the current GET, GACC or FOP should use the IMMED packet type. We use the IMMED packet type only when the origin/target/result datatypes are all basic types. Since the target does not know the origin/result datatypes, the origin process needs to set a flag to inform the target. However, this usage is redundant for GACC and FOP packets: when we use the IMMED packet type for GACC/FOP packets, the origin/target/result datatypes must be basic types, so we must use the IMMED packet type for the response packets as well, and MPIDI_CH3_PKT_FLAG_RMA_IMMED_RESP and the related code are not necessary. In short, flag MPIDI_CH3_PKT_FLAG_RMA_IMMED_RESP is useful only for the GET operation. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
- 03 Mar, 2015 1 commit
-
-
Xin Zhao authored
No reviewer.
-
- 13 Feb, 2015 6 commits
-
-
Xin Zhao authored
For GET-like RMA packets and response packets (GACC, GET, FOP, CAS, GACC_RESP, GET_RESP, FOP_RESP, CAS_RESP), we originally carried source_win_handle in the packet struct in order to locate the window handle on the origin side in the packet handler of response packets. However, this is not necessary because source_win_handle can be stored in the request on the origin side. This patch deletes source_win_handle from those packets to reduce the size of the packet union. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Xin Zhao authored
Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Xin Zhao authored
do_accumulate_op() does more comprehensive work on ACC computation than the OP function. For example, MPI_REPLACE is not defined as a predefined computation and is therefore not handled by the OP function, but it is safely handled in do_accumulate_op(). This patch replaces the OP function with do_accumulate_op() on the target side. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Xin Zhao authored
In this patch, we replace one argument of function finish_op_on_target, "packet(op) type", with "has_response_data". finish_op_on_target does not care which specific packet(op) type it is processing, only whether the current op has response data (like GET/GACC), so changing the argument in this way simplifies the code by avoiding the need to look up the packet(op) type every time before calling finish_op_on_target. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Xin Zhao authored
Originally we added "immed_data" and "immed_len" areas to RMA packets, in order to piggyback a small amount of data with the packet header and reduce the number of packets (note that "immed_len" is necessary when the piggybacked data is not the entire data). However, those areas potentially increase the packet union size and worsen two-sided communication. This patch fixes this issue. In this patch, we remove "immed_data" and "immed_len" from the normal "MPIDI_CH3_Pkt_XXX_t" operation type (e.g. MPIDI_CH3_Pkt_put_t), and we introduce a new "MPIDI_CH3_Pkt_XXX_immed_t" packet type for each operation (e.g. MPIDI_CH3_Pkt_put_immed_t). "MPIDI_CH3_Pkt_XXX_immed_t" is used when (1) both origin and target are basic datatypes, AND (2) the data to be sent fits entirely into the header. By doing this, "MPIDI_CH3_Pkt_XXX_immed_t" needs the "immed_data" area but can drop the "immed_len" area. Also, since it only works with a basic target datatype, it can drop the "dataloop_size" area as well. All operations that do not satisfy (1) or (2) use the normal "MPIDI_CH3_Pkt_XXX_t" type. Originally we always piggybacked FOP data into the packet header, which made the packet size too large. In this patch we split the FOP operation into IMMED packets and normal packets. Because CAS only works with 2 basic datatype and non-complex elements, the data amount is relatively small, so we always piggyback the data with the packet header and only use the "MPIDI_CH3_Pkt_XXX_immed_t" packet type for CAS. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
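The choice between the two packet types can be sketched as a simple predicate. The struct layouts and names below are hypothetical simplifications rather than the actual MPIDI_CH3_Pkt_* definitions, and the 16-byte IMMED area is an assumption borrowed from the default mentioned in an earlier commit.

```c
#include <assert.h>
#include <stddef.h>

#define IMMED_DATA_BYTES_SKETCH 16   /* assumed size of the IMMED area */

/* Normal packet: no immed area, but carries dataloop_size. */
typedef struct {
    int type;
    void *addr;
    int count;
    int dataloop_size;
} put_pkt_sketch_t;

/* IMMED packet: immed_data only; no immed_len and no dataloop_size. */
typedef struct {
    int type;
    void *addr;
    int count;
    char immed_data[IMMED_DATA_BYTES_SKETCH];
} put_immed_pkt_sketch_t;

/* The union is at least as large as its largest member, which is why
 * every extra field in any packet type matters for all communication. */
typedef union {
    put_pkt_sketch_t put;
    put_immed_pkt_sketch_t put_immed;
} pkt_union_sketch_t;

/* Conditions (1) and (2) from the commit message. */
static int use_immed_pkt_sketch(int origin_is_basic, int target_is_basic,
                                size_t data_bytes)
{
    return origin_is_basic && target_is_basic &&
           data_bytes <= IMMED_DATA_BYTES_SKETCH;
}
```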
-
Xin Zhao authored
Originally we added lock_type and origin_rank areas to the RMA packet, in order to piggyback a passive lock request with RMA operations. However, those areas potentially enlarge the packet union size, and they are not actually necessary and can be completely avoided. "Lock_type" is used to remember what type of lock (shared or exclusive) the origin wants to acquire on the target. To remove it from the RMA packet, we use the flags (which already exist in the RMA packet) to remember this information. "Origin_rank" is used to remember which origin has sent a lock request to the target, so that when the lock is granted to this origin later, the target can send an ack to that origin. In fact the target does not need to store origin_rank; it only needs to store origin_vc, which is known from the progress engine on the target side. Therefore, we can completely remove origin_rank from the RMA packet. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
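Encoding the lock type in existing flag bits might look like the sketch below. The flag names and bit positions are hypothetical and are not the actual MPIDI_CH3_PKT_FLAG_* values.

```c
#include <assert.h>

/* Hypothetical flag bits within the packet's existing flags field. */
enum {
    PKT_FLAG_LOCK_SHARED_SKETCH = 1 << 0,
    PKT_FLAG_LOCK_EXCLUSIVE_SKETCH = 1 << 1
};

typedef enum {
    LOCK_TYPE_NONE_SKETCH,
    LOCK_TYPE_SHARED_SKETCH,
    LOCK_TYPE_EXCLUSIVE_SKETCH
} lock_type_sketch_t;

/* Origin side: record the requested lock type in the flags. */
static int add_lock_request_sketch(int flags, lock_type_sketch_t type)
{
    assert(type != LOCK_TYPE_NONE_SKETCH);
    return flags | (type == LOCK_TYPE_SHARED_SKETCH
                        ? PKT_FLAG_LOCK_SHARED_SKETCH
                        : PKT_FLAG_LOCK_EXCLUSIVE_SKETCH);
}

/* Target side: recover the lock type without a dedicated packet field. */
static lock_type_sketch_t lock_type_from_flags_sketch(int flags)
{
    if (flags & PKT_FLAG_LOCK_EXCLUSIVE_SKETCH)
        return LOCK_TYPE_EXCLUSIVE_SKETCH;
    if (flags & PKT_FLAG_LOCK_SHARED_SKETCH)
        return LOCK_TYPE_SHARED_SKETCH;
    return LOCK_TYPE_NONE_SKETCH;
}
```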
-
- 08 Feb, 2015 1 commit
-
-
Xin Zhao authored
FOP, CAS and GACC are atomic "read-modify-write" operations, which means when the target window is defined on a SHM region, we need inter-process lock to guarantee the atomicity of the entire "read+OP". The current implementation is correct for SHM-based RMA operations, but not correct for AM-based RMA operations: for SHM-based operations, it protects the entire "read+OP", but for AM-based operations, it only protects the "OP" part. This patch fixes this issue by protecting the memory copy to temporary buffer and computation together for AM-based operations. Fix ticket 2226 Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
- 16 Dec, 2014 9 commits
-
-
Xin Zhao authored
No reviewer.
-
Xin Zhao authored
No reviewer.
-
Xin Zhao authored
The behavior of the UNLOCK_ACK flag is exactly the same as the behavior of FLUSH_ACK, so here we just delete the UNLOCK_ACK flag and use the FLUSH_ACK flag for all FLUSH ACK packets. No reviewer.
-
Xin Zhao authored
No reviewer.
-
Xin Zhao authored
No reviewer.
-
Xin Zhao authored
Because we will send different kinds of LOCK ACKs (not just LOCK_GRANTED, but maybe LOCK_DISCARDED, for example), naming the related packets and functions "LOCK_GRANTED" is no longer appropriate. Here we rename them to "LOCK_ACK". No reviewer.
-
Xin Zhao authored
Use int instead of size_t in RMA pkt header to reduce packet size. No reviewer.
-
Xin Zhao authored
In this patch we allow GET/GACC response packets to piggyback some IMMED data, just like what we did for PUT/GACC/FOP/CAS packets. No reviewer.
-
Xin Zhao authored
Originally we only allowed a LOCK request to be piggybacked with small RMA operations (where all data fits in the packet header). This adds communication overhead for larger operations, since the origin side needs to wait for the LOCK ACK before it can transmit data to the target. In this patch we add support for piggybacking LOCK with RMA operations of arbitrary size. Note that (1) this only works with basic datatypes; (2) if the LOCK cannot be satisfied, we temporarily buffer the operation on the target side. No reviewer.
-
- 13 Nov, 2014 3 commits
-
-
Xin Zhao authored
ReqHandler_GaccumLikeSendComplete is used for GACC-like operations, including GACC, CAS and FOP. Here we split it into the following three functions: ReqHandler_GaccumSendComplete, ReqHandler_CASSendComplete, ReqHandler_FOPSendComplete. This makes it convenient to add different actions for those three kinds of operations in the future. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Xin Zhao authored
Here we wrap up common action when one RMA op is finished on target into a function to make code structure cleaner. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Xin Zhao authored
When the operation pending list and the request lists are all empty, the origin needs to send a FLUSH message only if it has issued PUT/ACC operations since the last synchronization call; otherwise the origin does not need to issue FLUSH at all and does not need to wait for a FLUSH ACK message. Similarly, the origin waits for the ACK of an UNLOCK message only if it has issued PUT/ACC operations since the last synchronization call. However, the UNLOCK message always needs to be sent because the origin needs to unlock the target process. This patch avoids issuing unnecessary FLUSH / FLUSH ACK / UNLOCK ACK messages. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
- 04 Nov, 2014 2 commits
-
-
Min Si authored
There are two request handlers used when receiving data: (1) OnDataAvail, which is triggered when data has arrived; (2) OnFinal, which is triggered when receiving data is finished. In the progress engine, only OnDataAvail is triggered when a request is completed. The upper ch3 layer should change OnDataAvail to OnFinal when the incoming data will complete the request. However, in the original implementation, when receiving multiple segments of a large receive, OnDataAvail was reset to 0 at the last segment, hence the final action was lost. This patch fixes this bug. In the RMA target put/acc/gacc packet handlers, OnDataAvail was reset to the OnFinal function if OnDataAvail was 0 because of this bug. This patch also rewrites this part so that packet handlers only set the proper OnFinal handler at the beginning and let the data-receiving function change OnDataAvail to OnFinal at the last segment. Signed-off-by:
Xin Zhao <xinzhao3@illinois.edu>
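The intended handler switch can be sketched with function pointers. Everything below is a hypothetical simplification (names, the 1024-byte segment size, and the request layout are assumptions), not the CH3 request structure: the packet handler sets only the final handler, and the segment-loading function installs it as the data-available handler once the last segment has been posted.

```c
#include <assert.h>
#include <stddef.h>

#define SEGMENT_BYTES_SKETCH ((size_t) 1024)   /* assumed segment size */

typedef struct recv_req_sketch recv_req_sketch_t;
typedef int (*req_handler_sketch_fn)(recv_req_sketch_t *req);

struct recv_req_sketch {
    req_handler_sketch_fn on_data_avail;  /* called when a segment completes */
    req_handler_sketch_fn on_final;       /* set once by the packet handler */
    size_t bytes_left;                    /* data not yet posted for receive */
    int final_calls;                      /* counts final-handler invocations */
};

static int reload_handler_sketch(recv_req_sketch_t *req);

static int final_handler_sketch(recv_req_sketch_t *req)
{
    req->final_calls++;
    return 0;
}

/* Post the next segment. If it is the last one, switch on_data_avail to
 * on_final instead of resetting it to NULL (the bug described above). */
static void load_recv_segment_sketch(recv_req_sketch_t *req)
{
    size_t seg = req->bytes_left < SEGMENT_BYTES_SKETCH ? req->bytes_left
                                                        : SEGMENT_BYTES_SKETCH;
    req->bytes_left -= seg;
    req->on_data_avail = (req->bytes_left == 0) ? req->on_final
                                                : reload_handler_sketch;
}

static int reload_handler_sketch(recv_req_sketch_t *req)
{
    load_recv_segment_sketch(req);
    return 0;
}

/* Progress engine: a posted segment has fully arrived. */
static void progress_segment_done_sketch(recv_req_sketch_t *req)
{
    assert(req->on_data_avail != NULL);
    req->on_data_avail(req);
}
```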
-
Min Si authored
There are two requests associated with each request-based operation: one normal internal request (req) and one newly added user request (ureq). We return ureq to the user when the request-based op call returns. The ureq is initialized with its completion counter (CC) set to 1 and its ref count set to 2 (one reference held by CH3 and another by the user). If the corresponding op can be finished immediately in CH3, the runtime completes ureq in CH3 and lets the user's MPI_Wait/Test destroy ureq. If the corresponding op cannot be finished immediately, we first increment the ref count to 3 (because three places now need to reference ureq: user, CH3, progress engine). The progress engine completes ureq when the op is completed, then CH3 releases its reference during garbage collection, and finally the user's MPI_Wait/Test destroys ureq. The ureq can be completed in the following three ways: 1. If the op is issued and completed immediately in CH3 (req is NULL), we just complete ureq before freeing the op. 2. If the op is issued but not completed, we remember the ureq handler in req and set the OnDataAvail / OnFinal handlers in req to a newly added request handler, which completes the user request. The handler is triggered at three places: 2-a. when the progress engine completes a put/acc req; 2-b. when the get/getacc handler completes a get/getacc req; 2-c. when the progress engine completes a get/getacc req; 3. If the op is not issued (i.e., it waits for the lock to be granted), the 2nd way is eventually performed when the op is issued by the progress engine. Signed-off-by:
Xin Zhao <xinzhao3@illinois.edu>
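The reference-count lifecycle described above can be traced with a small model. This is a sketch under one assumption that the commit message only implies: completing the ureq also drops the completer's reference. All names are hypothetical.

```c
#include <assert.h>

/* Hypothetical model of the user request (ureq). */
typedef struct {
    int cc;          /* completion counter: 0 means complete */
    int ref_count;   /* outstanding references */
    int destroyed;   /* set when the last reference is released */
} ureq_sketch_t;

static void ureq_init_sketch(ureq_sketch_t *u)
{
    u->cc = 1;
    u->ref_count = 2;   /* one for CH3, one for the user */
    u->destroyed = 0;
}

static void ureq_add_ref_sketch(ureq_sketch_t *u)
{
    u->ref_count++;
}

static void ureq_release_sketch(ureq_sketch_t *u)
{
    assert(u->ref_count > 0);
    if (--u->ref_count == 0)
        u->destroyed = 1;
}

/* Assumption: completion decrements cc and drops the completer's reference. */
static void ureq_complete_sketch(ureq_sketch_t *u)
{
    assert(u->cc == 1);
    u->cc = 0;
    ureq_release_sketch(u);
}
```

In the immediate case the sequence is init, complete in CH3, then release by the user's MPI_Wait/Test. In the deferred case it is init, add a reference for the progress engine, complete in the progress engine, release by CH3 during garbage collection, then release by the user.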
-
- 03 Nov, 2014 8 commits
-
-
Xin Zhao authored
Add some original RMA PVARs back to the new RMA infrastructure, including timing of packet handlers, op allocation and setting, window creation, etc. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Xin Zhao authored
We made a huge change to the RMA infrastructure and a lot of old code can be dropped, including separate handlers for lock-op-unlock, ACCUM_IMMED specific code, O(p) data structure code, code for lazy issuing, etc. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Xin Zhao authored
1. Piggyback the LOCK request with the first IMMED operation. When we see an IMMED operation, we can always piggyback the LOCK request with that operation to save the sync message of a separate LOCK request. When the packet header of that operation is received on the target, we try to acquire the lock and perform that operation. The target either piggybacks the LOCK_GRANTED message with the response packet (if available), or sends a separate LOCK_GRANTED message back to the origin. 2. Rewrite the code that manages the lock queue. When the lock request cannot be satisfied on the target, we need to buffer that lock request on the target. All we need to do is enqueue the packet header, which contains all the information we need after the lock is granted. When the current lock is released, the runtime goes over the lock queue and grants the lock to the next available request. After the lock is granted, the runtime just triggers the packet handler a second time. 3. Release the lock on the target side if UNLOCK is piggybacked. If there are active-message operations to be issued, we piggyback an UNLOCK flag with the last operation. When the target receives it, it releases the current lock and grants the lock to the next process. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
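The lock-queue handling in item 2 can be sketched as below. The types, the fixed-size queue, and the policy of stopping at the first queued request that cannot be granted are all assumptions made for illustration. Performing the operation is modelled only as a counter.

```c
#include <assert.h>

enum { LOCK_NONE_SK = 0, LOCK_SHARED_SK, LOCK_EXCLUSIVE_SK };

/* Hypothetical packet header: just enough to retry the request later. */
typedef struct {
    int origin;
    int lock_type;
} pkt_hdr_sketch_t;

#define QUEUE_CAP_SKETCH 8

typedef struct {
    int lock_state;                         /* current lock on the window */
    int shared_holders;                     /* holders of a shared lock */
    pkt_hdr_sketch_t queue[QUEUE_CAP_SKETCH];
    int queue_len;
    int ops_performed;                      /* stands in for doing the op */
} win_sketch_t;

static int try_acquire_sketch(win_sketch_t *win, int lock_type)
{
    if (win->lock_state == LOCK_NONE_SK) {
        win->lock_state = lock_type;
        win->shared_holders = (lock_type == LOCK_SHARED_SK) ? 1 : 0;
        return 1;
    }
    if (win->lock_state == LOCK_SHARED_SK && lock_type == LOCK_SHARED_SK) {
        win->shared_holders++;
        return 1;
    }
    return 0;
}

/* Packet handler: returns 1 if the op ran now, 0 if the header was queued. */
static int handle_pkt_sketch(win_sketch_t *win, pkt_hdr_sketch_t hdr)
{
    if (try_acquire_sketch(win, hdr.lock_type)) {
        win->ops_performed++;
        return 1;
    }
    assert(win->queue_len < QUEUE_CAP_SKETCH);
    win->queue[win->queue_len++] = hdr;
    return 0;
}

/* Release one holder; when the lock becomes free, retry queued headers in
 * order and stop granting at the first one that cannot be satisfied. */
static void release_lock_sketch(win_sketch_t *win)
{
    if (win->lock_state == LOCK_SHARED_SK && --win->shared_holders > 0)
        return;
    win->lock_state = LOCK_NONE_SK;

    pkt_hdr_sketch_t pending[QUEUE_CAP_SKETCH];
    int n = win->queue_len, blocked = 0;
    for (int i = 0; i < n; i++)
        pending[i] = win->queue[i];
    win->queue_len = 0;
    for (int i = 0; i < n; i++) {
        if (!blocked && try_acquire_sketch(win, pending[i].lock_type)) {
            win->ops_performed++;
        } else {
            blocked = 1;
            win->queue[win->queue_len++] = pending[i];
        }
    }
}
```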
-
Xin Zhao authored
For the FOP operation, all data fits into the packet header, so on the origin side we do not need to send separate data packets, and on the target side we do not need a request handler; only a packet handler is needed. Similarly, for the FOP response packet, we can receive all data in the FOP resp packet handler. This patch deletes the request handler on the target side and simplifies the packet handlers on the target / origin side. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Xin Zhao authored
We add an IMMED data area (16 bytes by default) to the packet header, which contains as much origin data as possible. If the origin can put all data in the packet header, it no longer needs to send a separate data packet. When the target receives the packet header, it first copies the data out of the IMMED data area. If more data is still coming, it continues to receive the following packets; if all data is included in the header, receiving is done. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
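A minimal sketch of filling the IMMED area on the origin side follows. The header layout and names are hypothetical; only the 16-byte default comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define IMMED_BYTES_SKETCH ((size_t) 16)   /* default size from the commit */

/* Hypothetical packet header with an IMMED data area. */
typedef struct {
    char immed_data[16];
    size_t immed_len;   /* bytes of origin data carried in the header */
} immed_hdr_sketch_t;

/* Copy as much origin data as fits into the header. Returns the number of
 * bytes that still have to be sent in separate data packets (0 if none). */
static size_t fill_immed_sketch(immed_hdr_sketch_t *hdr, const void *data,
                                size_t len)
{
    size_t n = len < IMMED_BYTES_SKETCH ? len : IMMED_BYTES_SKETCH;
    memcpy(hdr->immed_data, data, n);
    hdr->immed_len = n;
    return len - n;
}
```

A return value of 0 corresponds to the case where the target is done as soon as it has processed the header.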
-
Xin Zhao authored
During PSCW, when there are active-message operations to be issued in Win_complete, we piggyback an AT_COMPLETE flag with the last one so that when the target receives it, it can decrement a counter on the target side and detect completion when that counter reaches zero. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-
Xin Zhao authored
When the origin wants to do a FLUSH sync, if there are active-message operations that are going to be issued, we piggyback the FLUSH message with the last operation; if there are no such operations, we just send a single FLUSH packet. If the last operation is a write op (PUT, ACC) or only a single FLUSH packet is sent, the target sends back a single FLUSH_ACK packet after it receives it; if the last operation contains a read action (GET, GACC, FOP, CAS), the target piggybacks a FLUSH_ACK flag with the response packet after it receives it. After the origin receives the FLUSH_ACK packet or a response packet with the FLUSH_ACK flag, it decrements the counter that tracks the number of outgoing sync messages (FLUSH / UNLOCK). When that counter reaches zero, the origin knows that remote completion has been achieved. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
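The two decisions in this protocol, how the target acknowledges and when the origin detects remote completion, can be sketched as follows. All type and function names are hypothetical.

```c
#include <assert.h>

typedef enum {
    OP_NONE_SK,   /* no pending op: a single FLUSH packet is sent */
    OP_PUT_SK, OP_ACC_SK,
    OP_GET_SK, OP_GACC_SK, OP_FOP_SK, OP_CAS_SK
} rma_op_sketch_t;

typedef enum {
    ACK_SEPARATE_PKT_SK,     /* target sends a single FLUSH_ACK packet */
    ACK_PIGGYBACK_RESP_SK    /* target sets FLUSH_ACK flag on the response */
} ack_mode_sketch_t;

/* Target side: ops that produce a response packet carry the ack on it. */
static ack_mode_sketch_t flush_ack_mode_sketch(rma_op_sketch_t last_op)
{
    switch (last_op) {
    case OP_GET_SK:
    case OP_GACC_SK:
    case OP_FOP_SK:
    case OP_CAS_SK:
        return ACK_PIGGYBACK_RESP_SK;
    default:
        return ACK_SEPARATE_PKT_SK;
    }
}

/* Origin side: counter of outgoing sync messages awaiting an ack. */
typedef struct {
    int outstanding_acks;
} origin_sync_sketch_t;

static void on_sync_sent_sketch(origin_sync_sketch_t *s)
{
    s->outstanding_acks++;
}

/* Returns 1 when remote completion is reached (counter hits zero). */
static int on_flush_ack_sketch(origin_sync_sketch_t *s)
{
    assert(s->outstanding_acks > 0);
    return --s->outstanding_acks == 0;
}
```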
-
Xin Zhao authored
Separate the final request handler of PUT, ACC, GACC into three. Separate the derived DT request handler of ACC and GACC into two. Rename the request handlers as follows: (1) Normal request handler: it is triggered on the target side when all data from the origin is received. It includes: ReqHandler_PutRecvComplete --- for PUT; ReqHandler_AccumRecvComplete --- for ACC; ReqHandler_GaccumRecvComplete --- for GACC. (2) Derived DT request handler: it is triggered on the target side when all derived DT info is received. It includes: ReqHandler_PutDerivedDTRecvComplete --- for PUT; ReqHandler_AccumDerivedDTRecvComplete --- for ACC; ReqHandler_GaccumDerivedDTRecvComplete --- for GACC. (3) Response request handler: it is triggered on the target side when sending back is finished in GET-like operations. It includes: ReqHandler_GetSendComplete --- for GET; ReqHandler_GaccumLikeSendComplete --- for GACC, FOP, CAS. Signed-off-by:
Pavan Balaji <balaji@anl.gov>
-