Commit d749acb6 authored by Jonathan Jenkins's avatar Jonathan Jenkins

Getting started document and minor fixes to model READMEs

parent 82bd3ad7
This document serves the following purposes:
* Introduce and present an overview of the key components making up the CODES
networking library "model-net", along with available models and utilities.
= model-net overview
model-net is a set of networking APIs and models intended to provide
transparent message passing support between LPs, allowing the underlying
network LPs to do the work of routing while user LPs model their applications.
It consists of a number of both simple and complex network models as well as
configuration utilities and communication APIs. A somewhat stale overview is
also given at src/models/networks/model-net/doc/README.
= Components of model-net
== messaging API
model-net provides a few messaging routines for use by LPs. Each specify a
message payload and an event to execute upon message receipt. They allow for:
* push-style messaging, e.g., MPI_Send with an asynchronous MPI_Recv at the
receiver.
* pull-style messaging (e.g., RMA). E.g., LP x issues a pull message event,
which reaches LP y. Upon receipt, LP y sends the specified message payload
back to LP x. LP x executes the event it specified once the message payload
is fully received. Note that, at the modeler level, The pull message event is
not seen by the destination LP (LP y in this example).
* (experimental) initial collective communication support. This is currently
being evaluated - more documentation on this feature will follow.
The messaging types, with the exception of the collective routines,
support specializing by LP annotation, only communicating through model-net LPs
containing the given annotation. Additionally, there is initial support for
different message scheduling algorithms - see codes/model-net-sched.h and the
following section. In particular, message-specific parameters can be set via
the model_net_set_msg_param function in codes/model-net.h, which currently is
only used with the "priority" scheduler.
The messaging API is given by codes/model-net.h, through the
model_net_event family of functions. Usage can be found in the example program
of the codes-base repository.
== LP configuration / mapping
model-net LPs, like any other LP in CODES/ROSS, require specification in the
configuration file as well as LP-specific parameters. To identify model-net LPs
to CODES, we use the naming scheme of prefixing model-net LP names with the
string "modelnet_". For example, the "simplenet" LP (see "model-net models")
would be specified in the LPGROUPS section as "modelnet_simplenet".
Currently, model-net LPs expect their configuration values in the PARAMS
section. This is due to historical precedent and may be changed in the future
to use their own dedicated section. There are a combination of general
model-net parameters and model-specific parameters. The general parameters are:
* modelnet_order - this is the only required option. model-net assigns integer
IDs to the different types of networks used (independent of annotation) so
that the user can differentiate between different network topologies without
knowing the particular models used. All models used in the CODES
configuration file must be referenced in the parameter. See
doc/example_heterogeneous in the codes-base repository.
* packet_size - the size of packets model-net will break messages into
* modelnet_scheduler - the algorithm for scheduling packets when multiple
concurrent messages are being processed. Options can be found in the
SCHEDULER_TYPES macro in codes/model-net-sched.h). In particular, the
"priority" scheduler requires two additional parameters in PARAMS:
"prio-sched-num-prios" and "prio-sched-sub-sched", the former of which sets
the number of priorities to use and the latter of which sets the scheduler
used for messages with the same priority.
== Statistics tracking
model-net tracks basic statistics for each LP such as bytes sent/received and
time spent sending/receiving messages. If LP-IO has been enabled in the
program, they will be printed to the specified directory.
= model-net models
Currently, model-net contains a combination of analytical models and specific
models for high-performance networking, with an HPC bent.
Configuration files for each model can be found in tests/conf, under the name
"model-net-test*.conf".
== Simplenet/Simplewan
The Simplenet and Simplewan models (model-net LP names "simplenet" and
"simplewan") are basic queued point-to-point latency/bandwidth models, assuming
infinite packet buffering when routing. These are best used for models that
require little fidelity out of the network performance. Simplewan is the same
model as Simplenet, except it provides heterogeneous link capacities from point
to point. Rather than a single entry, it requires files containing a matrix of
point-to-point bandwidths and latencies.
Simplenet models require two configuration parameters: "net_startup_ns" and
"net_bw_mbps", which define the startup and bandwidth costs in nanoseconds and
megabytes per second, respectively. Simplewan requires two configuration files
for the startup and bandwidth costs: "net_startup_ns_file" and
"net_bw_mbps_file".
More details about the models can be found at
src/models/networks/model-net/doc/README.simplenet.txt and
src/models/networks/model-net/doc/README.simplewan.txt, respectively.
== LogGP
The LogGP model (model-net LP name: "loggp"), based on the LogP
model, is a more advanced analytical model, containing point-to-point latency
("L"), CPU overhead ("o"), a minimum time between consecutive messages ("gap",
or "g"), and a time-per-byte-sent ("G") independent from the gap.
The only configuration entry the LogGP model requires is
"PARAMS:net_config_file", which points to a file with path *relative* to the
configuration file.
For more details on gathering parameters for the LogGP model, as well as it's
usage and caveats, see the document src/models/model-net/doc/README.loggp.txt.
== Torus
The torus model (model-net LP name: "torus") represents a torus topology with
arbitrary dimension, used primarily in HPC systems. It is based on the work
performed in (TODO: cites).
The configuration parameters are as follows:
* n_dims - the dimensionality of the torus
* dim_length - the dimensions of the torus. For example, "4,2,2,2" describes a
four-dimensional, 4x2x2x2 torus.
* link_bandwidth - the bandwidth available per torus link
* buffer_size - the buffer size available at each torus node
* num_vc - number of virtual channels (currently unused)
* chunk_size - element size per transfer. Messages/packets are sent in
individual chunks. This is typically a small number (e.g., 32 bytes)
== Dragonfly
The dragonfly model (model-net LP name: "dragonfly") is an experimental network
topology that utilizes the concept of virtual routers to produce systems with
very high virtual radix out of network components with a lower radix. The
topology itself and the simulation model are both described in
src/models/networks/model-net/doc/README.dragonfly.txt.
cite).
The configuration parameters are a little trickier here, as additional LPs
other than the "modelnet_dragonfly" LP must be specified. "modelnet_dragonfly"
represent the node LPs (terminals), while a second type "dragonfly_router"
represents a physical router. At least one "dragonfly_router" LP must be
present in every LP group with a "modelnet_dragonfly" LP.
Further configuration and model setup can be found at
src/models/model-net/doc/README.dragonfly.txt.
......@@ -3,12 +3,3 @@
The torus configuration parameters should set an upper bound on the number of servers/compute nodes.
- Change the buffer overflow message? May be allow a slowed down packet injection rate if the buffer overflows and report the average injection rate at the
end of the simulation.
2- Mapping:
- Update the mapping code to handle remainder (where number of ROSS LPs are not evenly divisible by the number of PEs).
- Make the mem_factor argument configurable in the codes mapping file (used for g_tw_events_per_pe).
- Remove the group id and type id arguments from get_lp_info function.
Make the model-net test case running for two different types of networks?
- Use the I/O activity trace logs from BG/Q and BG/P for I/O modelling.
......@@ -62,7 +62,6 @@ connections between routers, network nodes and the servers.
Some other dragonfly specific parameters in the PARAMS section are
- packet_size: size of a dragonfly packet in bytes (default set to 512 bytes)
- num_vcs: number of virtual channels connecting a router-router, node-router (default set to 1)
- local_vc_size: Number of packet chunks (default: 32 bytes) that can fit in the channel connecting routers
within the same group.
......
......@@ -27,7 +27,7 @@ data format as mpptest.
The loggp modelnet method reads in output produced by Torsten's netgauge
tool and uses it as a lookup table to calculate delays for network messages.
You can find an exmaple of netgauge output from the Tukey system at ANL
You can find an example of netgauge output from the Tukey system at ANL
(using MVAPICH over InfiniBand) in tests/ng-mpi-tukey.dat.
A few notes on the implementation:
......@@ -64,4 +64,4 @@ A few notes on the implementation:
handling network card contention. It tracks the state of input and output
queues in terms of when they will next be available for communication and
adjusts those times as new messages are scheduled for communication. The
model assuems infinite buffering in the network switch fabric.
model assumes infinite buffering in the network switch fabric.
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment