Synthetic traffic replay on dragonfly network model
Traffic patterns supported
- Uniform random traffic: each node sends messages to a randomly selected destination node. Because the traffic is uniformly distributed throughout the network, this pattern performs better with minimal routing than with non-minimal or adaptive routing.
- Nearest group traffic: each node sends traffic to the nearest group. With minimal routing, all of this traffic crosses the single global channel connecting the two groups, which congests the network. This pattern performs better with non-minimal and adaptive routing algorithms.
- Nearest neighbor traffic: each node sends traffic to the next node, which may be connected to the same router.
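As a rough illustration of how the three patterns above choose a destination, here is a minimal Python sketch. It is not the model's actual code: the function name pick_destination, the assumption that node IDs are numbered contiguously within each group, and the choice to exclude the sender from the uniform random draw are all assumptions made for this example. The pattern numbers follow the values of the traffic option.

```python
import random

def pick_destination(src, num_nodes, nodes_per_group, traffic, rng=random):
    """Return a destination node ID for one message sent by node `src`.

    Hypothetical helper: assumes node IDs 0..num_nodes-1 are laid out
    contiguously, group by group.
    """
    if traffic == 1:
        # Uniform random: draw from the other num_nodes - 1 nodes,
        # then skip over the sender's own ID.
        dst = rng.randrange(num_nodes - 1)
        return dst if dst < src else dst + 1
    if traffic == 2:
        # Nearest group: same offset inside the next group (wraps around).
        return (src + nodes_per_group) % num_nodes
    if traffic == 3:
        # Nearest neighbor: the next node ID (wraps around).
        return (src + 1) % num_nodes
    raise ValueError(f"unknown traffic pattern: {traffic}")

# Illustrative sizes only: 72 nodes in groups of 8.
print(pick_destination(0, 72, 8, 2))   # 8
print(pick_destination(71, 72, 8, 3))  # 0
```

With the nearest group pattern every node in a group targets the same neighboring group, which is why minimal routing concentrates the load on one global channel.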
Collecting binned data
The modelnet_enable_sampling function takes a sampling interval "t" and an end time, both in nanoseconds. Until the end time is reached, the dragonfly model collects compute node and router samples every "t" simulated nanoseconds. The names of the sampling output files can be set in the config file with the cn_sample_file and rt_sample_file arguments. By default, the compute node and router outputs are written to dragonfly-cn-sampling-%d.bin and dragonfly-router-sampling-%d.bin. Corresponding metadata files are also generated; they describe the file format, the dragonfly configuration in use, the router radix, and so on.
An example utility that reads the binary files and translates them into text format can be found at src/networks/model-net/read-dragonfly-sample.c. Note that the router radix (RADIX) in the utility must be set to match the dragonfly configuration so that arrays can be allocated contiguously. By default, the radix is set to 16, corresponding to a 1,056 node dragonfly network. The utility can be built with mpicc, and it expects the generated binary files to be in the same directory when translating from binary to text.
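To get a feel for how many samples a given interval and end time produce, the following Python sketch lists the simulated times at which samples would be taken. It is only an illustration: the function name sample_times is hypothetical, and it assumes the first sample is taken at time "t" and that a sample falling exactly on the end time is still collected. It does not read or write the binary sample files.

```python
def sample_times(interval_ns, end_time_ns):
    """Return the simulated times (ns) at which a sample would be taken.

    Assumes samples at t, 2t, 3t, ... up to and including end_time_ns.
    """
    if interval_ns <= 0:
        raise ValueError("sampling interval must be positive")
    count = int(end_time_ns // interval_ns)
    return [interval_ns * (i + 1) for i in range(count)]

# Illustrative values: sample every 400 ns until 1000 ns.
print(sample_times(400, 1000))  # [400, 800]
```

Under these assumptions, each compute node and each router contributes one record per listed time, so halving the interval roughly doubles the size of the output files.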
Running simulations
ROSS optimistic mode:
mpirun -np 4 ./install/bin/model-net-synthetic --sync=3 --traffic=1 \
    --lp-io-dir=mn_synthetic --lp-io-use-suffix=1 --arrival_time=1000.0 -- \
    src/network-workloads/conf/modelnet-synthetic-dragonfly.conf
ROSS serial mode:
./install/bin/model-net-synthetic --sync=1 --traffic=1 \
    --lp-io-dir=mn_synthetic --lp-io-use-suffix=1 --arrival_time=1000.0 -- \
    src/network-workloads/conf/modelnet-synthetic-dragonfly.conf
options:
- arrival_time: inter-arrival time between messages. A smaller inter-arrival time means messages are generated more frequently, which can cause congestion in the network.
- num_msgs: number of messages generated per terminal. Each message has a size of 2048 bytes. By default, 20 messages per terminal are generated.
- traffic: 1 for uniform random traffic, 2 for nearest group traffic and 3 for nearest neighbor traffic.
- sampling-interval: sets the sampling interval, in simulated nanoseconds, used when collecting binned data.
- sampling-end-time: sets the end time, in simulated nanoseconds, after which sampling stops.
- lp-io-dir: directory in which network traffic information for dragonfly terminals and routers is written. The individual files are:
  - dragonfly-router-stats: records how much time each link of a router spent with its buffer space full. This shows which links of a router were more congested than the others.
  - dragonfly-msg-stats: holds overall network traffic information, i.e. how long the terminal links connected to the routers were congested, the amount of data received by each terminal, the time spent receiving the data, the number of packets finished, and the average number of hops traversed by the received packets.
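The relationship between arrival_time, num_msgs and the traffic each terminal offers to the network can be estimated with some simple arithmetic. The Python sketch below is a back-of-the-envelope aid, not part of the simulator: the function name offered_load is hypothetical, and it assumes arrival_time is given in nanoseconds and can be treated as the average gap between consecutive messages.

```python
MSG_SIZE_BYTES = 2048  # message size stated in the options above

def offered_load(arrival_time_ns, num_msgs=20):
    """Estimate per-terminal traffic for the synthetic workload.

    Returns (rate in bytes per ns, total bytes sent, injection window in ns).
    One byte per nanosecond equals 1 GB/s (10**9 bytes per second).
    """
    if arrival_time_ns <= 0:
        raise ValueError("arrival_time must be positive")
    rate = MSG_SIZE_BYTES / arrival_time_ns
    total_bytes = MSG_SIZE_BYTES * num_msgs
    injection_window_ns = arrival_time_ns * num_msgs
    return rate, total_bytes, injection_window_ns

rate, total, window = offered_load(1000.0)
print(f"{rate:.3f} GB/s, {total} bytes over {window:.0f} ns")
# 2.048 GB/s, 40960 bytes over 20000 ns
```

Under these assumptions, halving arrival_time doubles the rate each terminal tries to inject, which is one way to push the network toward congestion.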
Synthetic traffic replay on slim fly network model
Traffic patterns supported
(1) Uniform random traffic: each node sends messages to a randomly selected destination node. Because the traffic is uniformly distributed throughout the network, this pattern performs better with minimal routing than with non-minimal or adaptive routing.
(2) Worst-case traffic: simulates an application whose communication fully saturates links in the network and thus creates a bottleneck for minimal routing. In this workload, each compute node attached to a router, R1, communicates with a node attached to a paired router that is the maximum two hops away. A second pair of routers that shares the same middle link with the first pair is set up so that the center link is fully saturated. This arrangement puts a worst-case burden on the link between routers 2 and 3, as 4p nodes create 2p data flows. With all nodes paired in this configuration, congestion quickly builds up for all nodes in the system and limits maximum throughput to 1/(2p).
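The 1/(2p) limit in the worst-case pattern follows from counting the flows that share the middle link. A small Python sketch of that arithmetic is given below; the function name worst_case_throughput is hypothetical, p is taken to be the number of compute nodes per router, and the link bandwidth is assumed to be divided equally among the flows.

```python
from fractions import Fraction

def worst_case_throughput(p):
    """Return (nodes involved, flows on the shared link, per-flow share).

    The share is a fraction of the middle link's bandwidth, assuming the
    link is split equally among all flows that cross it.
    """
    if p < 1:
        raise ValueError("p must be at least 1")
    nodes = 4 * p   # four routers (two pairs), p compute nodes each
    flows = 2 * p   # data flows that cross the shared middle link
    return nodes, flows, Fraction(1, flows)

# Illustrative value: p = 3 compute nodes per router.
print(worst_case_throughput(3))  # (12, 6, Fraction(1, 6))
```

For p = 3, for example, each flow can use at most one sixth of the link bandwidth, however much load the nodes try to inject.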
Running simulations
ROSS optimistic mode:
mpirun -n 4 ./install/bin/model-net-synthetic-slimfly --sync=3 --traffic=1 \
    --lp-io-dir=mn_synthetic --lp-io-use-suffix=1 \
    --load=0.95 -- ../../jenkins/codes/src/network-workloads/conf/modelnet-synthetic-slimfly-min.conf
ROSS serial mode:
./install/bin/model-net-synthetic-slimfly --sync=1 --traffic=1 \
    --lp-io-dir=mn_synthetic --lp-io-use-suffix=1 \
    --load=0.95 -- ../src/network-workloads/conf/modelnet-synthetic-slimfly-min.conf
options:
load: fraction of link bandwidth that each compute node should use (for example, 0.95 for 95%). Each node generates packets at a rate that sustains this link utilization.
traffic: 1 for uniform random traffic, 2 for worst-case traffic.
*slimfly-results-log.txt: has information on each slim fly execution, including model size, LPs, PEs, latency, efficiency and run time. Results are appended after each execution.
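To illustrate how the load option described above translates into a packet generation rate, the Python sketch below computes the gap between packets needed to sustain a given fraction of link bandwidth. It is a simplified model rather than the simulator's own calculation: the function name packet_interval_ns is hypothetical, and the bandwidth and packet size used in the example are placeholder values, not settings taken from the slim fly configuration file.

```python
def packet_interval_ns(load, link_bandwidth_gb_per_s, packet_size_bytes):
    """Return the time between packets (ns) that sustains `load`.

    `load` is a fraction in (0, 1]. Bandwidth is in GB/s, which is
    numerically the same as bytes per nanosecond.
    """
    if not 0 < load <= 1:
        raise ValueError("load must be in the range (0, 1]")
    if link_bandwidth_gb_per_s <= 0 or packet_size_bytes <= 0:
        raise ValueError("bandwidth and packet size must be positive")
    return packet_size_bytes / (load * link_bandwidth_gb_per_s)

# Placeholder values: 12.5 GB/s link, 256-byte packets.
for load in (0.25, 0.5, 0.95):
    print(load, round(packet_interval_ns(load, 12.5, 256), 2))
```

A higher load shortens the interval between packets, so a value such as 0.95 keeps each node's link close to saturation.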
Synthetic traffic replay on fat tree network model
TODO