This document serves the following purposes:
* Introduce and present an overview of the key components making up the CODES
  networking library "model-net", along with available models and utilities.

= model-net overview

model-net is a set of networking APIs and models intended to provide
transparent message passing support between LPs, allowing the underlying
network LPs to do the work of routing while user LPs model their applications.
It consists of a number of both simple and complex network models as well as
configuration utilities and communication APIs. A somewhat stale overview is
also given at src/models/networks/model-net/doc/README.

= Components of model-net

== messaging API

model-net provides a few messaging routines for use by LPs. Each specifies a
message payload and an event to execute upon message receipt. They allow for:
* push-style messaging, e.g., MPI_Send with an asynchronous MPI_Recv at the
* pull-style messaging (e.g., RMA). For example, LP x issues a pull message
  event, which reaches LP y. Upon receipt, LP y sends the specified message
  payload back to LP x. LP x executes the event it specified once the message
  payload is fully received. Note that, at the modeler level, the pull message
  event is not seen by the destination LP (LP y in this example).
* (experimental) initial collective communication support. This is currently
  being evaluated - more documentation on this feature will follow.
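
As a rough illustration of the pull-style flow described above, the following
toy Python sketch replays the sequence of steps with a plain event queue. It
does not use model-net or ROSS. The function run_pull, the tuple-based events,
and the LP names are hypothetical and exist only for this example.

```python
from collections import deque

def run_pull(requester, target, payload_size):
    """Replay a pull-style exchange and return the trace of steps.

    Toy sketch only: 'requester' and 'target' stand in for user LPs x and y.
    None of this is the model-net API.
    """
    trace = []
    queue = deque()
    # LP x issues the pull request.
    queue.append(("pull_request", requester, target))
    while queue:
        kind, src, dst = queue.popleft()
        if kind == "pull_request":
            # The request arrives on y's side. At the modeler level LP y does
            # not see it; the payload is simply sent back to the requester.
            trace.append(f"network at {dst}: pull request from {src}")
            queue.append(("payload", dst, src))
        elif kind == "payload":
            # Payload fully received at x: run the event x specified.
            trace.append(f"{dst}: received {payload_size} bytes from {src}")
            trace.append(f"{dst}: completion event executed")
    return trace

trace = run_pull("x", "y", 4096)
for step in trace:
    print(step)
```

The point of the sketch is the ordering: the completion event at LP x runs
only after the payload has come back from LP y.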

The messaging types, with the exception of the collective routines,
support specializing by LP annotation, only communicating through model-net LPs
containing the given annotation. Additionally, there is initial support for
different message scheduling algorithms - see codes/model-net-sched.h and the
following section. In particular, message-specific parameters can be set via
the model_net_set_msg_param function in codes/model-net.h, which currently is
only used with the "priority" scheduler.

The messaging API is given by codes/model-net.h, through the
model_net_event family of functions. Usage can be found in the example program
of the codes-base repository.

== LP configuration / mapping

model-net LPs, like any other LP in CODES/ROSS, require specification in the
configuration file as well as LP-specific parameters. To identify model-net LPs
to CODES, we use the naming scheme of prefixing model-net LP names with the
string "modelnet_". For example, the "simplenet" LP (see "model-net models")
would be specified in the LPGROUPS section as "modelnet_simplenet". 
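
A minimal sketch of such an LPGROUPS entry is shown below. The group name
EXAMPLE_GRP, the "server" LP type, and all counts are hypothetical, and the
syntax is assumed to follow that of other CODES configuration files. Only the
"modelnet_" prefix on the simplenet LP name comes from the convention described
above.

```
LPGROUPS
{
   EXAMPLE_GRP
   {
      repetitions="4";
      server="1";
      modelnet_simplenet="1";
   }
}
```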

Currently, model-net LPs expect their configuration values in the PARAMS
section. This is due to historical precedent and may be changed in the future
to use their own dedicated section. There is a combination of general
model-net parameters and model-specific parameters. The general parameters are:
* modelnet_order - this is the only required option. model-net assigns integer
  IDs to the different types of networks used (independent of annotation) so
  that the user can differentiate between different network topologies without
  knowing the particular models used. All models used in the CODES
  configuration file must be referenced in the parameter. See
  doc/example_heterogeneous in the codes-base repository.
* packet_size - the size of packets model-net will break messages into
* modelnet_scheduler - the algorithm for scheduling packets when multiple
  concurrent messages are being processed. Options can be found in the
  SCHEDULER_TYPES macro in codes/model-net-sched.h. In particular, the
  "priority" scheduler requires two additional parameters in PARAMS:
  "prio-sched-num-prios" and "prio-sched-sub-sched", the former of which sets
  the number of priorities to use and the latter of which sets the scheduler
  used for messages with the same priority.
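
The general parameters above might be combined as in the following sketch. All
values are illustrative. The array syntax used for modelnet_order and the
sub-scheduler name "fcfs" are assumptions that should be checked against the
example configurations and the SCHEDULER_TYPES macro. Model-specific parameters
are omitted.

```
PARAMS
{
   modelnet_order=( "simplenet" );
   packet_size="512";
   modelnet_scheduler="priority";
   prio-sched-num-prios="2";
   prio-sched-sub-sched="fcfs";
}
```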

== Statistics tracking

model-net tracks basic statistics for each LP, such as bytes sent/received and
time spent sending/receiving messages. If LP-IO has been enabled in the
program, these statistics are written to the specified directory.

= model-net models

Currently, model-net contains a combination of analytical models and models of
specific high-performance networks.

Configuration files for each model can be found in tests/conf, under the name

== Simplenet/SimpleP2P

The Simplenet and SimpleP2P models (model-net LP names "simplenet" and
"simplep2p") are basic queued point-to-point latency/bandwidth models, assuming
infinite packet buffering when routing. These are best used for models that
require little fidelity out of the network performance. SimpleP2P is the same
model as Simplenet, except it provides heterogeneous link capacities from point
to point. Rather than a single entry, it requires files containing a matrix of
point-to-point bandwidths and latencies. 

Simplenet models require two configuration parameters: "net_startup_ns" and
"net_bw_mbps", which define the startup and bandwidth costs in nanoseconds and
megabytes per second, respectively. SimpleP2P requires two configuration files
for the latency and bandwidth costs: "net_latency_ns_file" and

More details about the models can be found at
src/models/networks/model-net/doc/README.simplenet.txt and
src/models/networks/model-net/doc/README.simplep2p.txt, respectively.
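
To give a feel for how the two Simplenet parameters interact, here is a small
Python sketch of a generic latency/bandwidth cost. It is an assumption made for
illustration, not a transcription of the Simplenet implementation: the cost is
taken to be the startup time plus the message size divided by the bandwidth.
Whether a megabyte means 2**20 or 10**6 bytes is also a guess, so it is exposed
as the hypothetical bytes_per_mb argument.

```python
def simplenet_transfer_ns(message_bytes, net_startup_ns, net_bw_mbps,
                          bytes_per_mb=1024 * 1024):
    """Estimate a transfer time in nanoseconds for a latency/bandwidth model.

    Assumed cost: startup + size / bandwidth, with the bandwidth given in
    megabytes per second.
    """
    if net_bw_mbps <= 0:
        raise ValueError("net_bw_mbps must be positive")
    seconds = message_bytes / (net_bw_mbps * bytes_per_mb)
    return net_startup_ns + seconds * 1e9

# One mebibyte over a 1000 MB/s link with a 1000 ns startup cost.
t = simplenet_transfer_ns(1024 * 1024, net_startup_ns=1000.0,
                          net_bw_mbps=1000.0)
print(t)
```

For small messages the startup term dominates, while for large messages the
bandwidth term does.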

== LogGP

The LogGP model (model-net LP name: "loggp"), based on the LogP
model, is a more advanced analytical model, containing point-to-point latency
("L"), CPU overhead ("o"), a minimum time between consecutive messages ("gap",
or "g"), and a time-per-byte-sent ("G") independent from the gap.
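
The parameters above combine as follows in the textbook LogGP formulation for a
single message. This Python sketch shows that formula only and is not
necessarily what the model-net implementation computes. The function name and
the sample values are hypothetical.

```python
def loggp_message_time(k_bytes, L, o, G):
    """End-to-end time of one k-byte message under the textbook LogGP model.

    o (sender overhead) + (k - 1) * G (per-byte cost) + L (latency)
    + o (receiver overhead). The gap g only constrains consecutive
    messages, so it does not appear for a single message.
    """
    if k_bytes < 1:
        raise ValueError("message must contain at least one byte")
    return o + (k_bytes - 1) * G + L + o

# All times share one unit (for example, microseconds).
t = loggp_message_time(1024, L=5.0, o=1.5, G=0.01)
print(t)
```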

The only configuration entry the LogGP model requires is
"PARAMS:net_config_file", which points to a file with path *relative* to the
configuration file.

For more details on gathering parameters for the LogGP model, as well as its
usage and caveats, see the document src/models/model-net/doc/README.loggp.txt.

== Torus

The torus model (model-net LP name: "torus") represents a torus topology with
arbitrary dimension, used primarily in HPC systems. It is based on the work
performed in:
    M. Mubarak, C. D. Carothers, R. B. Ross, P. Carns. “A case study in using
    massively parallel simulation for extreme-scale torus network co-design”,
    ACM SIGSIM conference on Principles of Advanced Discrete Simulations
    (PADS), 2014.

The configuration parameters are as follows:
* n_dims - the dimensionality of the torus
* dim_length - the dimensions of the torus. For example, "4,2,2,2" describes a
  four-dimensional, 4x2x2x2 torus.
* link_bandwidth - the bandwidth available per torus link
* buffer_size - the buffer size available at each torus node
* num_vc - number of virtual channels (currently unused - all traffic goes
  through a single channel)
* chunk_size - element size per transfer. Messages/packets are sent in
  individual chunks. This is typically a small number (e.g., 32 bytes)
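
To make the n_dims and dim_length parameters concrete, the following Python
sketch parses a dim_length string and derives two geometric properties of the
torus: the node count and the minimal hop distance between two coordinates. The
helper names are hypothetical, and the hop count is a property of the topology,
not a statement about the routing algorithm the model uses.

```python
from math import prod

def parse_dim_length(dim_length):
    """Parse a dim_length string such as "4,2,2,2" into a list of ints."""
    return [int(part) for part in dim_length.split(",")]

def torus_node_count(dims):
    """The total number of nodes is the product of the dimension lengths."""
    return prod(dims)

def torus_hops(a, b, dims):
    """Minimal hop count between coordinates a and b with wraparound links."""
    hops = 0
    for x, y, n in zip(a, b, dims):
        d = abs(x - y)
        hops += min(d, n - d)  # take the shorter way around the ring
    return hops

dims = parse_dim_length("4,2,2,2")
print(len(dims), torus_node_count(dims))
print(torus_hops((0, 0, 0, 0), (3, 1, 0, 1), dims))
```

Here len(dims) corresponds to n_dims, so the two parameters must agree.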

== Dragonfly

The dragonfly model (model-net LP name: "dragonfly") is an experimental network
topology that utilizes the concept of virtual routers to produce systems with
very high virtual radix out of network components with a lower radix. The
topology itself and the simulation model are both described in

The configuration parameters are a little trickier here, as additional LPs
other than the "modelnet_dragonfly" LP must be specified. "modelnet_dragonfly"
represents the node LPs (terminals), while a second type, "dragonfly_router",
represents a physical router. At least one "dragonfly_router" LP must be
present in every LP group with a "modelnet_dragonfly" LP. 
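
A sketch of an LP group satisfying this requirement might look as follows. The
group name EXAMPLE_GRP, the "server" LP type, and all counts are hypothetical,
and the syntax is assumed to match other CODES configuration files. Only the
two dragonfly LP names are taken from the description above.

```
LPGROUPS
{
   EXAMPLE_GRP
   {
      repetitions="8";
      server="4";
      modelnet_dragonfly="4";
      dragonfly_router="1";
   }
}
```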

Further configuration and model setup can be found at