CODES Network Models
model-net is a set of networking APIs and models intended to provide transparent message passing support between LPs, allowing the underlying network LPs to do the work of routing while user LPs model their applications. It consists of a number of both simple and complex network models as well as configuration utilities and communication APIs. A somewhat stale overview is also given at src/networks/model-net/doc/README.
Components of model-net
model-net provides a few messaging routines for use by LPs. Each specify a message payload and an event to execute upon message receipt. They allow for:
- push-style messaging, e.g., MPI_Send with an asynchronous MPI_Recv at the receiver.
- pull-style messaging (e.g., RMA). E.g., LP x issues a pull message event, which reaches LP y. Upon receipt, LP y sends the specified message payload back to LP x. LP x executes the event it specified once the message payload is fully received. Note that, at the modeler level, The pull message event is not seen by the destination LP (LP y in this example).
- (experimental) initial collective communication support. This is currently being evaluated - more documentation on this feature will follow.
The messaging types, with the exception of the collective routines, support specializing by LP annotation, only communicating through model-net LPs containing the given annotation. Additionally, there is initial support for different message scheduling algorithms - see codes/model-net-sched.h and the following section. In particular, message-specific parameters can be set via the model_net_set_msg_param function in codes/model-net.h, which currently is only used with the "priority" scheduler.
The messaging API is given by codes/model-net.h, through the model_net_event family of functions. Usage can be found in the example program of the codes-base repository.
LP configuration / mapping
model-net LPs, like any other LP in CODES/ROSS, require specification in the configuration file as well as LP-specific parameters. To identify model-net LPs to CODES, we use the naming scheme of prefixing model-net LP names with the string "modelnet_". For example, the "simplenet" LP (see "model-net models") would be specified in the LPGROUPS section as "modelnet_simplenet".
Currently, model-net LPs expect their configuration values in the PARAMS section. This is due to historical precedent and may be changed in the future to use their own dedicated section. There are a combination of general model-net parameters and model-specific parameters. The general parameters are:
- modelnet_order - this is the only required option. model-net assigns integer IDs to the different types of networks used (independent of annotation) so that the user can differentiate between different network topologies without knowing the particular models used. All models used in the CODES configuration file must be referenced in the parameter. See doc/example_heterogeneous in the codes-base repository.
- packet_size - the size of packets in bytes. model-net will break messages into
- modelnet_scheduler - the algorithm for scheduling packets when multiple concurrent messages are being processed. Options can be found in the SCHEDULER_TYPES macro in codes/model-net-sched.h). In particular, the "priority" scheduler requires two additional parameters in PARAMS: "prio-sched-num-prios" and "prio-sched-sub-sched", the former of which sets the number of priorities to use and the latter of which sets the scheduler used for messages with the same priority.
model-net tracks basic statistics for each LP such as bytes sent/received and time spent sending/receiving messages. If LP-IO has been enabled in the program, they will be printed to the specified directory.
Network models supported
Currently, model-net contains a combination of analytical models and specific models for high-performance networking, with an HPC bent.
Configuration files for each model can be found in tests/conf, under the name "model-net-test*.conf".
The Simplenet and SimpleP2P models (model-net LP names "simplenet" and "simplep2p") are basic queued point-to-point latency/bandwidth models, assuming infinite packet buffering when routing. These are best used for models that require little fidelity out of the network performance. SimpleP2P is the same model as Simplenet, except it provides heterogeneous link capacities from point to point. Rather than a single entry, it requires files containing a matrix of point-to-point bandwidths and latencies.
Simplenet models require two configuration parameters: "net_startup_ns" and "net_bw_mbps", which define the startup and bandwidth costs in nanoseconds and megabytes per second, respectively. SimpleP2P requires two configuration files for the latency and bandwidth costs: "net_latency_ns_file" and "net_bw_mbps_file".
More details about the models can be found at src/networks/model-net/doc/README.simplenet.txt and src/networks/model-net/doc/README.simplep2p.txt, respectively.
The LogGP model (model-net LP name: "loggp"), based on the LogP model, is a more advanced analytical model, containing point-to-point latency ("L"), CPU overhead ("o"), a minimum time between consecutive messages ("gap", or "g"), and a time-per-byte-sent ("G") independent from the gap.
The only configuration entry the LogGP model requires is "PARAMS:net_config_file", which points to a file with path relative to the configuration file.
For more details on gathering parameters for the LogGP model, as well as it's usage and caveats, see Loggp network model.
The torus model (model-net LP name: "torus") represents a torus topology with arbitrary dimension, used primarily in HPC systems. It is based on the work performed in: M. Mubarak, C. D. Carothers, R. B. Ross, P. Carns. “A case study in using massively parallel simulation for extreme-scale torus network co-design”, ACM SIGSIM conference on Principles of Advanced Discrete Simulations (PADS), 2014.
The configuration and model setup can be found at torus network model
The dragonfly model (model-net LP name: "dragonfly") is a network topology that utilizes the concept of virtual routers to produce systems with very high virtual radix out of network components with a lower radix. The topology itself and the simulation model are both described in src/networks/model-net/doc/README.dragonfly.txt. cite).
The configuration parameters are a little trickier here, as additional LPs other than the "modelnet_dragonfly" LP must be specified. "modelnet_dragonfly" represent the node LPs (terminals), while a second type "dragonfly_router" represents a physical router. At least one "dragonfly_router" LP must be present in every LP group with a "modelnet_dragonfly" LP.
Further configuration and model setup can be found at Dragonfly model dally style. A newer version of dragonfly model that supports a Cray style network is also available. Instructions can be found at Dragonfly Model Cray style.
Introduced by Besta and Hoefler, the slim fly is a hierarchical topology having many groups of interconnected routers constructed following McKay, Miller, Siran (MMS) graphs. Routers are grouped into two subgraphs, each containing "q" groups with "q" routers per group. Unlike the dragonfly network, the slimfly does not have fully connected router groups. Intragroup connections are determined by equations (1) and (2) (equation one for routers in subgraph 0 and equation 2 for routers in subgraph 1). No connections exist between router groups within the same subgraph, resulting in a bipartite graph between subgraphs. Each router has "h" global connections, which connect to routers in the opposing subgraph and are governed by equation (3). Each router has “p” number of compute nodes connected to it. The theoretical upper bound for bisection bandwidth for the Slim Fly topology is achieved when p = floor(k/2).
router(0, x, y) is connected to (0, x, y') iff y-y' in X (1) router(1, m, c) is connected to (1, m, c') iff c-c' in X' (2) router(0, x, y) is connected to (1, m, c) iff y = mx+c (3)
Further configuration and model setup can be found at slim fly network model.
The fat tree network model in CODES is a 3-level fat tree network with options for pruned-fat tree (Summit system style) and full fat tree configurations. Routing options supported are static and adaptive routings. For more details on configuration and model setup, see multirail fat tree model.