Running CODES with Cortex and DUMPI traces
This section assumes you have installed CODES with Cortex as explained here. You do not need to have enabled the Python support.
Setting up traces
Create a folder "runs" from which you will execute your experiments. In this folder, create a subfolder "traces", and uncompress a set of DUMPI traces in a subfolder of "traces". For instance, runs/traces/neurones will contain your set of DUMPI traces.
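The layout described above can be sketched with a few shell commands. This is a minimal sketch: the archive name neurones-traces.tar.gz is a hypothetical placeholder for wherever your compressed traces live, and "neurones" is simply the example trace-set name used in this section.

```shell
#!/bin/sh
# Create the experiment folder and the subfolder that will hold one trace set.
mkdir -p runs/traces/neurones

# Hypothetical archive name (an assumption): replace it with your own trace archive.
ARCHIVE="neurones-traces.tar.gz"

if [ -f "$ARCHIVE" ]; then
    # Unpack the traces directly into the trace-set subfolder.
    tar -xzf "$ARCHIVE" -C runs/traces/neurones
else
    echo "archive $ARCHIVE not found; copy your traces into runs/traces/neurones by hand" >&2
fi

# List what ended up in the trace-set subfolder.
ls runs/traces/neurones
```

If your traces are packaged differently (for example as a zip file or a plain directory), only the unpacking step needs to change.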
Preparing configuration files
Copy the file $HOME/CODES/codes/src/network-workloads/conf/modelnet-mpi-test-dfly-amg-216.conf to your runs folder as config.conf. You can use another network configuration file instead; this one is the one that was used for testing.
Create an allocation file "alloc.conf". This allocation file contains a list of N integers, where N is the number of processes of the application you wish to simulate (it should correspond to the number of traces). Each integer represents the ID of the compute node on which a process is run. An easy way of generating a contiguous allocation with the right number of IDs is to use the following bash script (change it for your case):
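#!/bin/bash
# Directory and filename prefix of the DUMPI traces (adjust for your case)
TRACE_DIR="traces/neurones"
TRACE_PFX="dumpi-2016.09.14.15.05.22-"
rm -f alloc.conf
# Count the trace files: one .bin file per process
x=$(ls "$TRACE_DIR/$TRACE_PFX"*.bin | wc -l)
# Write the node IDs 0..x-1, separated by spaces
for ((i=0; i < x; i++))
do
    printf "%d " "$i" >> alloc.conf
done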
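Since the allocation file should contain one node ID per trace, it can be worth checking the two counts before launching a simulation. The helper below is a sketch and not part of CODES; the function name check_alloc and its argument order are assumptions made for illustration.

```shell
#!/bin/sh
# check_alloc ALLOC_FILE TRACE_DIR TRACE_PFX   (hypothetical helper, not part of CODES)
# Succeeds when ALLOC_FILE holds exactly one node ID per .bin trace file.
check_alloc() {
    # Number of whitespace-separated integers in the allocation file.
    n_ids=$(( $(wc -w < "$1") ))
    # Number of trace files matching the prefix.
    n_traces=$(( $(ls "$2/$3"*.bin 2>/dev/null | wc -l) ))
    if [ "$n_ids" -eq "$n_traces" ]; then
        echo "OK: $n_ids node IDs for $n_traces traces"
    else
        echo "MISMATCH: $n_ids node IDs for $n_traces traces" >&2
        return 1
    fi
}
```

With the example names used in this section, the call would be `check_alloc alloc.conf traces/neurones dumpi-2016.09.14.15.05.22-`.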
Running your experiment
The following script should help you automate the run:
#!/bin/sh
TRACE_DIR="traces/neurones"
TRACE_PFX="dumpi-2016.09.14.15.05.22-"
OUTPUT_DIR="results"
CODES="$HOME/CODES/install/codes/bin/model-net-mpi-replay"
NUM_TRACES=$(ls $TRACE_DIR/$TRACE_PFX*.bin | wc -l)
PARAMS="--sync=1 \
--num_net_traces=$NUM_TRACES \
--workload_file=$TRACE_DIR/$TRACE_PFX \
--lp-io-dir=$OUTPUT_DIR \
--lp-io-use-suffix=1 \
--workload_type=dumpi \
--alloc_file=alloc.conf"
CONFIG="config.conf"
$CODES $PARAMS -- $CONFIG
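A run depends on three inputs: the simulator binary, the network configuration file, and the allocation file. The sketch below checks that all three exist before the simulator is launched. The function name preflight is an assumption made for illustration; it is not provided by CODES.

```shell
#!/bin/sh
# preflight BINARY CONFIG ALLOC_FILE   (hypothetical helper, not part of CODES)
# Reports every missing input and fails if at least one is missing.
preflight() {
    status=0
    [ -x "$1" ] || { echo "simulator not executable: $1" >&2; status=1; }
    [ -f "$2" ] || { echo "missing network configuration: $2" >&2; status=1; }
    [ -f "$3" ] || { echo "missing allocation file: $3" >&2; status=1; }
    return $status
}
```

Calling `preflight "$CODES" "$CONFIG" alloc.conf || exit 1` just before the final line of the run script would stop the script early with a message naming each missing input.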