From 020731dc8e8391f189a72d8253a50cb31a82b55f Mon Sep 17 00:00:00 2001 From: John Jenkins Date: Mon, 16 Dec 2013 11:19:17 -0600 Subject: [PATCH] example program docs, added ignores for latex --- doc/.gitignore | 7 ++ doc/codes-best-practices.tex | 180 +++++++++++++++++++++++++---------- 2 files changed, 139 insertions(+), 48 deletions(-) create mode 100644 doc/.gitignore diff --git a/doc/.gitignore b/doc/.gitignore new file mode 100644 index 0000000..e7d8a4a --- /dev/null +++ b/doc/.gitignore @@ -0,0 +1,7 @@ +# latex files +*.aux +*.bbl +*.blg +*.log +# we don't want the output to be part of the repository +/codes-best-practices.pdf diff --git a/doc/codes-best-practices.tex b/doc/codes-best-practices.tex index a19f79c..046ec0a 100644 --- a/doc/codes-best-practices.tex +++ b/doc/codes-best-practices.tex @@ -479,28 +479,29 @@ section(s). \end{enumerate} \section{CODES Example Model} -Outline -- The Basics: -\subsection{The Basics} + +TODO: Standardize the namings for codes configuration, mapping, and model-net. + This is a simple CODES example to demonstrate the concepts described above. In -the example scenario, we have a certain number of storage servers where each -server has a network interface card (NIC) associated with it. The servers exchange -messages with their neighboring servers via their NIC card. When the neighboring -server receives the message, it sends another message to the sending server in -response. This process continues until a specific threshold is reached. - -Paragraph 2: John, can you please complete the following paragraph? I've added -some notes. - -Note: In this paragraph, we start off with technical aspects of the simulation. -Here we describe the LP types i.e. server LPs and NIC LPs. We then talk about -the local completion message and the remote completion messages sent by server -LPs. 
Local completion message is sent to the sending server when the message
-leaves the sender NIC (Posting a message send at a server does not necessarily
-mean that the message has been sent thats why we have a local completion
-message). Remote message is the data sent to the destination server LP from
-the sending LP. In this example, the remote message triggers another request
-message from the receiving server LP (Refer to Listings \ref{snippet1}).
+this scenario, we have a certain number of storage servers, identified
+through indices $0,\ldots,n-1$, where each server has a network interface card
+(NIC) associated with it. The servers exchange messages with their neighboring
+server via their NICs (i.e., server $i$ pings server $i+1$, rolling over the
+index if necessary). When the neighboring server receives the message, it sends
+an acknowledgement message to the sending server in response. Upon receiving
+the acknowledgement, the sending server issues another message. This process
+continues until a configured number of messages has been sent. For simplicity,
+it is assumed that each server has a direct link to its neighbor and that no
+network congestion occurs due to concurrent messages being sent.
+
+The model is relatively simple to simulate using ROSS. There are
+two distinct LP types in the simulation: the server and the NIC. Refer to
+Listing~\ref{snippet1} for data structure definitions. The server LPs
+are in charge of issuing/acknowledging the messages, while the NIC LPs
+(implemented via CODES's model-net) transmit the data and inform their
+corresponding servers upon completion. This LP decomposition strategy is
+generally preferred for ROSS-based simulations: use single-purpose, simple LPs
+representing logical system components.
\begin{figure} \begin{lstlisting}[caption=Server state and event message struct, label=snippet1] @@ -523,28 +524,33 @@ struct svr_msg \end{lstlisting} \end{figure} -\subsection{CODES mapping} -The CODES mapping API transparently maps LP types to MPI ranks (Aka ROSS PE's). The -LP type and count can be specified through a configuration file. In this section, we -explain the configuration file of the CODES simple example followed by how the -CODES mapping API can be used in the example program. Figure \ref{snippet2} -shows the LP mapping config file for the CODES example. Multiple LP types are -specified in a single LP group (there can also be multiple LP groups in a config file). - -There is 1 server LP and 1 modelnet\_simplenet LP type in a group and this -combination is repeated 16 time (repetitions="16"). ROSS will assign the LPs to -the PEs (PEs is an abstraction for MPI rank in ROSS) by placing 1 server LP -then 1 modelnet\_simplenet LP a total of 16 times. This configuration is useful -if there is heavy communication involved between the server and -modelnet\_simplenet LP types, in which case ROSS will place them on the same PE -so that the communication between server and modelnet\_simplenet LPs will not involve remote -messages. +In this program, CODES is used in the following four ways: to provide +configuration utilities for the program, to logically separate and provide +lookup functionality for multiple LP types, to automate LP placement on KPs/PEs, +and to simplify/modularize the underlying network structure. The CODES +Configurator is used for the first use-case, the CODES Mapping API is used for +the second and third use-cases, and the CODES Model-Net API is used for the +fourth use-case. The following sections discuss these while covering necessary +ROSS-specific information. + +\subsection{CODES Configurator} + +Listing~\ref{snippet2} shows a stripped version of example.conf (see the file +for comments). 
The configuration format organizes key-value pairs into categories, and
+optionally into subgroups within each category. The LPGROUPS
+category defines the LP configuration (described in
+Section~\ref{subsec:codes_mapping}). The PARAMS category is used by both
+CODES Mapping and Model-Net for configuration, providing both ROSS-specific and
+network-specific parameters. For instance, the message\_size field defines the
+maximum event size used in ROSS for memory management. User-defined
+categories can be used as well; in this case, they define the rounds
+of communication and the size of each message.
 
 \begin{figure}
 \begin{lstlisting}[caption=example configuration file for CODES LP mapping, label=snippet2]
 LPGROUPS
 {
-   MODELNET_GRP
+   SERVERS
    {
       repetitions="16";
       server="1";
@@ -559,9 +565,39 @@ PARAMS
     net_startup_ns="1.5";
     net_bw_mbps="20000";
 }
+server_pings
+{
+    num_reqs="5";
+    payload_sz="4096";
+}
 \end{lstlisting}
 \end{figure}
+
+\subsection{CODES Mapping}
+\label{subsec:codes_mapping}
+
+The CODES Mapping API transparently maps LP types to MPI ranks (a.k.a.\ ROSS
+PEs). The LP type and count can be specified through the CODES Configurator.
+In this section, we focus on the CODES Mapping API as well as the
+corresponding configuration. Refer again to Listing~\ref{snippet2}. Multiple
+LP types are specified in a single LP group (there can also be multiple LP
+groups in a config file).
+
+In Listing~\ref{snippet2}, there is 1 server LP and 1 modelnet\_simplenet LP
+type in a group, and this combination is repeated 16 times (repetitions="16").
+ROSS will assign the LPs to the PEs (a PE is ROSS's abstraction for an MPI
+rank) by placing 1 server LP, then 1 modelnet\_simplenet LP, a total of 16
+times.
This
+configuration is useful if there is heavy communication involved between the
+server and modelnet\_simplenet LP types, in which case ROSS will place them on
+the same PE so that the communication between server and modelnet\_simplenet
+LPs will not involve remote messages.
+
+An important consideration when defining the configuration file is the way
+Model-Net maps the network-layer LPs (the NICs in this example) to the
+upper-level LPs (e.g., the servers). Specifically, each NIC is mapped in a
+one-to-one manner to the calling LP through the calling LP's group name,
+repetition number, and number within the repetition.
+
 After the initialization function calls of ROSS (tw\_init), the configuration
 file can be loaded in the example program using the calls in Figure
 \ref{snippet3}. Each LP type must register itself using `lp\_type\_register'
@@ -605,7 +641,7 @@ static void svr_add_lp_type()
 \end{lstlisting}
 \end{figure}
 
-The CODES mapping API provides ways to query information like number of LPs of
+The CODES Mapping API provides ways to query information such as the number of
 LPs of a particular LP type, the group to which an LP type belongs, and the
 repetitions in the group (for details, see codes-base/codes/codes-mapping.h).
 Figure \ref{snippet3} shows how to set up the CODES Mapping API with our CODES
 example and
@@ -619,7 +655,7 @@ maintains a count of the number of remote messages it has sent and received as
 well as the number of local completion messages.
 
 For the server event message, we have four message types: KICKOFF, REQ, ACK and
-LOCAL. With a KICKOFF event, each LP sends a message to itself (the simulation
+LOCAL. With a KICKOFF event, each LP sends a message to itself (the simulation
 begins from here). To avoid event ties, we add a small amount of noise using
 the random number generator (see Section \ref{sec_kickoff}). The server LP
 state data structure and server message data structures are given in Figure
 \ref{snippet5}.
The \`REQ\'
@@ -652,24 +688,24 @@ static void svr_event(svr_state * ns, tw_bf * b, svr_msg * m, tw_lp * lp)
 \end{lstlisting}
 \end{figure}
 
-\subsection{Model-net API}
-Model-net is an abstraction layer that allow models to send messages
+\subsection{Model-Net API}
+Model-Net is an abstraction layer that allows models to send messages
 across components using different network transports. This is a
 consistent API that can send messages across either torus, dragonfly, or
 simplenet network models without changing the higher-level model code.
 
 In the CODES example, we use simple-net as the underlying plug-in for
-model-net. The simple-net parameters are specified by the user in the config
+Model-Net. The simple-net parameters are specified by the user in the config
 file (see Figure \ref{snippet2}). A call to `model\_net\_set\_params' sets up
 the model-net parameters as given in the config file.
 
-model\_net assumes that the caller already knows what LP it wants to deliver the
+Model-Net assumes that the caller already knows which LP it wants to deliver the
 message to and how large the simulated message is. It carries two types of
 events: (1) a remote event to be delivered to a higher-level model LP (in the
-example, the model-net LPs carry the remote event to the server LPs) and (2) a
+example, the Model-Net LPs carry the remote event to the server LPs) and (2) a
 local event to be delivered to the caller once the message has been transmitted
 from the node (in the example, a local completion message is delivered to the
-server LP once the model-net LP sends the message). Figure \ref{snippet6} shows
+server LP once the Model-Net LP sends the message). Figure \ref{snippet6} shows
 how the server LP sends messages to the neighboring server using the Model-Net
 LP.
@@ -699,7 +735,55 @@ static void handle_kickoff_event(svr_state * ns,
 \end{figure}
 
 \subsection{Reverse computation}
-John: Can you please fill in the details about the reverse handlers?
+
+ROSS has the capability for optimistic parallel simulation, but instead of
+saving the state of each LP, it requires users to perform \emph{reverse
+computation}. That is, while the event messages are themselves preserved (until
+the Global Virtual Time (GVT) algorithm renders the messages unneeded), the LP
+state is not preserved. Hence, it is up to the simulation developer to provide
+functionality to reverse the LP state, given the event to be reversed. ROSS
+makes this simpler in that events will always be rolled back in exactly the
+reverse of the order in which they were applied. Note that ROSS also has both
+serial and parallel conservative modes, so reverse computation may not be
+necessary if the simulation is not computationally intense.
+
+For our example program, recall the ``forward'' event handlers. They perform the
+following:
+\begin{description}
+    \item [Kickoff] send a message to the peer server, and increment the sender
+    LP's count of sent messages.
+    \item [Request (received from peer server)] increment the receiver's count
+    of received messages, and send an acknowledgement to the sender.
+    \item [Acknowledgement (received from message receiver)] send the next
+    message to the receiver and increment the count of sent messages. Set a
+    flag indicating whether a message has been sent.
+    \item [Local model-net callback] increment the count of local model-net
+    completion messages received.
+\end{description}
+
+In terms of LP state, the four operations simply modify counts. Hence,
+the ``reverse'' event handlers merely need to roll back those changes:
+\begin{description}
+    \item [Kickoff] decrement the sender LP's count of sent messages.
+    \item [Request (received from peer server)] decrement the receiver's count
+    of received messages.
+    \item [Acknowledgement (received from message receiver)] decrement the
+    count of sent messages if the flag indicating that a message was sent is
+    set.
+    \item [Local model-net callback] decrement the count of local model-net
+    completion messages received.
+\end{description}
+
+For more complex LP states (such as maintaining queues), reverse event
+processing becomes similarly more complex. Other sections of this document
+highlight strategies for dealing with those cases.
+
+Note that ROSS maintains the ``lineage'' of events currently stored, which
+enables ROSS to roll back the messages in the reverse of the order in which
+they were originally processed. This greatly simplifies the reverse
+computation process: the LP state seen when reversing the effects of a
+particular event is exactly the state that resulted from processing that event
+in the first place (unless, of course, the event handlers are buggy).
 
 \section{TODO}
-- 
2.26.2