...
 
Contributors to date (with affiliations at time of contribution)
- Philip Carns, Argonne National Laboratory
- Misbah Mubarak, Argonne National Laboratory
- Shane Snyder, Argonne National Laboratory
- Jonathan P. Jenkins, Argonne National Laboratory
- Noah Wolfe, Rensselaer Polytechnic Institute
- Nikhil Jain, Lawrence Livermore National Laboratory
- Jens Domke, Univ. of Dresden
- Giorgis Georgakoudis, Lawrence Livermore National Laboratory
- Matthieu Dorier, Argonne National Laboratory
- Caitlin Ross, Rensselaer Polytechnic Institute
- Xu Yang, Illinois Institute of Tech.
- Jens Domke, Tokyo Institute of Tech.
- Xin Wang, Illinois Institute of Tech.
- Neil McGlohon, Rensselaer Polytechnic Institute
- Elsa Gonsiorowski, Rensselaer Polytechnic Institute
- Justin M. Wozniak, Argonne National Laboratory
- Robert B. Ross, Argonne National Laboratory
- Lee Savoie, Univ. of Arizona
- Ning Liu, Rensselaer Polytechnic Institute
- Jason Cope, Argonne National Laboratory
Contributions:
Misbah Mubarak (ANL)
- Introduced 1-D dragonfly and enhanced torus network model.
- Added quality of service in dragonfly and megafly network models.
- Added MPI simulation layer to simulate MPI operations.
- Updated and merged burst buffer storage model with 2-D dragonfly.
- Added and validated 2-D dragonfly network model.
- Added multiple workload sources including MPI communication, Scalable
Workload Models, DUMPI communication traces.
- Added online simulation capability with Argobots and SWMs.
- Instrumented the network models to report time-stepped series statistics.
- Bug fixes for network, storage and workload models with CODES.
Neil McGlohon (RPI)
- Introduced Dragonfly Plus/Megafly network model.
- Merged 1-D dragonfly and 2-D dragonfly network models.
- Updated adaptive routing in megafly and 1-D dragonfly network models.
- Extended slim fly network model's dual-rail mode to arbitrary number of rails (pending).
Nikhil Jain, Abhinav Bhatele (LLNL)
- Improvements in credit-based flow control of CODES dragonfly and torus network models.
Jens Domke (U. of Dresden)
- Static routing in fat tree network model including ground work for
  dumping the topology and reading the routing tables.
John Jenkins
- Introduced storage models in a separate codes-storage-repo.
- Enhanced the codes-mapping APIs to map advanced combinations on PEs.
- Bug fixes in the network models.
- Bug fixes in the MPI simulation layer.
Xu Yang (IIT)
- Added support for running multiple application workloads with CODES MPI
  Simulation layer, along with supporting scripts and utilities.
Noah Wolfe (RPI):
- Added a fat tree network model that supports full and pruned fat tree
  network.
- Added a multi-rail implementation for the fat tree networks (pending).
- Added a dual-rail implementation for slim fly networks (pending).
- Bug reporter for CODES network models.
Caitlin Ross (RPI):
......
COPYRIGHT
The following is a notice of limited availability of the code, and disclaimer
which must be included in the prologue of the code and in all source listings
of the code.
Copyright Notice
© 2013 University of Chicago
Permission is hereby granted to use, reproduce, prepare derivative works, and
to redistribute to others. This software was authored by:
Mathematics and Computer Science Division
Argonne National Laboratory, Argonne IL 60439
(and)
Computer Science Department
Rensselaer Polytechnic Institute, Troy NY 12180
GOVERNMENT LICENSE
Portions of this material resulted from work developed under a U.S.
Government Contract and are subject to the following license: the Government
is granted for itself and others acting on its behalf a paid-up, nonexclusive,
irrevocable worldwide license in this computer software to reproduce, prepare
derivative works, and perform publicly and display publicly.
DISCLAIMER
This computer code material was prepared, in part, as an account of work
sponsored by an agency of the United States Government. Neither the United
States, nor the University of Chicago, nor any of their employees, makes any
warranty express or implied, or assumes any legal liability or responsibility
for the accuracy, completeness, or usefulness of any information, apparatus,
product, or process disclosed, or represents that its use would not infringe
privately owned rights.
************** Copyright © 2019, UChicago Argonne, LLC ***************
All Rights Reserved
Software Name: CO-Design of Exascale Storage and Network Architectures (CODES)
By: Argonne National Laboratory, Rensselaer Polytechnic Institute, Lawrence Livermore National Laboratory, and Illinois Institute of Technology
OPEN SOURCE LICENSE
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
******************************************************************************************************
DISCLAIMER
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
***************************************************************************************************
# **IMPORTANT NOTICE**
## THE CODES PROJECT HAS BEEN MOVED
https://github.com/codes-org/codes
As of June 21, 2019, the repository has moved to GitHub as part of the codebase's shift to an open source BSD license. All issues will be moved in some form to the new location, followed by merge requests (if possible). I will do my best to keep the master branches of the two repositories in sync with each other. That being said, if there is any conflict, the GitHub repository should be taken as the source of truth.
Thank you,
Neil McGlohon
https://github.com/codes-org/codes
# CODES Discrete-event Simulation Framework
https://xgitlab.cels.anl.gov/codes/codes/wikis/home
Discrete event driven simulation of HPC system architectures and subsystems has emerged as a productive and cost-effective means of evaluating potential HPC designs, along with capabilities for executing simulations of extreme scale systems. The goal of the CODES project is to use highly parallel simulation to explore the design of exascale storage/network architectures and distributed data-intensive science facilities.
Our simulations build upon the Rensselaer Optimistic Simulation System (ROSS), a discrete event simulation framework that allows simulations to be run in parallel, decreasing the simulation run time of massive simulations to hours. We are using ROSS to explore topics including large-scale storage systems, I/O workloads, HPC network fabrics, distributed science systems, and data-intensive computation environments.
......
## README for using ROSS instrumentation with CODES
For details about the ROSS instrumentation, see the [ROSS Instrumentation blog post](http://ross-org.github.io/instrumentation/instrumentation.html)
on the ROSS webpage.
There are currently 4 types of instrumentation: GVT-based, real time sampling, virtual time sampling, and event tracing.
See the ROSS documentation for more info on the specific options or use `--help` with your model.
To collect data about the simulation engine, no changes are needed to model code for any of the instrumentation modes.
Some additions to the model code are needed in order to turn on any model-level data collection.
See the "Model-level data sampling" section of the [ROSS Instrumentation blog post](http://ross-org.github.io/instrumentation/instrumentation.html).
Here we describe CODES-specific details.
### Register Instrumentation Callback Functions
... The second pointer is for the data to be sampled at the GVT or real time sampling ...
In this case the LPs have different function pointers since we want to collect different types of data for the two LP types.
For the terminal, I set the appropriate size of the data to be collected, but for the router, the size of the data is dependent on the radix for the dragonfly configuration being used, which isn't known until runtime.
*Note*: You can only reuse the function for event tracing for LPs that use the same type of message struct.
For example, the dragonfly terminal and router LPs both use the `terminal_message` struct, so they can
use the same functions for event tracing.
However, the model-net base LP uses the `model_net_wrap_msg` struct, so it gets its own event collection function and `st_trace_type` struct, in order to read the event type correctly from the model.
In the ROSS instrumentation documentation, there are two methods provided for letting ROSS know about these `st_model_types` structs.
In CODES, this step is a little different, as `codes_mapping_setup()` calls `tw_lp_settype()`.
Instead, you add a function to return this struct for each of your LP types:
```C
static const st_model_types *dragonfly_get_model_types(void)
...
static void router_register_model_types(st_model_types *base_type)
...
```
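The bodies of these functions are elided in this excerpt. As a rough, hypothetical sketch (the contents of the `st_model_types` initializers depend on your ROSS version and are omitted here), such functions typically just hand back, or copy out, a statically defined descriptor:

```C
#include <ross.h>   /* assumed to declare st_model_types */

/* illustrative descriptors: [0] = terminal LP, [1] = router LP */
static st_model_types dragonfly_model_types[2];

static const st_model_types *dragonfly_get_model_types(void)
{
    /* return the descriptor for the terminal LP */
    return &dragonfly_model_types[0];
}

static void router_register_model_types(st_model_types *base_type)
{
    /* copy the router descriptor into the slot provided by the caller */
    *base_type = dragonfly_model_types[1];
}
```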
At this point, there are two different steps to follow depending on whether the model is one of the model-net models or not.
##### Model-net Models
In the `model_net_method` struct, two fields have been added: `mn_model_stat_register` and `mn_get_model_stat_types`.
You need to set these to the functions described above. For example:
```C
/* ... model_net_method example elided ... */

st_model_types svr_model_types[] = {
    /* ... */
};

static void svr_register_model_types()
{
    st_model_type_register("ns-lp", &svr_model_types[0]);
}

int main(int argc, char **argv)
{
    // ... some set up removed for brevity
    model_net_register();
    svr_add_lp_type();

    if (g_st_ev_trace || g_st_model_stats)
        svr_register_model_types();

    codes_mapping_setup();
    //...
}
```
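The `model_net_method` assignment itself falls in the elided portion of the listing above; a hypothetical sketch of that wiring (only the two instrumentation fields named earlier are shown, and the struct's remaining fields are omitted) could look like:

```C
/* hypothetical sketch, not the actual dragonfly method definition */
struct model_net_method dragonfly_method = {
    /* ... the usual model-net callbacks go here ... */
    .mn_model_stat_register = router_register_model_types,
    .mn_get_model_stat_types = dragonfly_get_model_types,
};
```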
`g_st_ev_trace` is a ROSS flag for determining if event tracing is turned on, and `g_st_model_stats` determines whether the GVT-based or real time instrumentation
modes are collecting model-level data as well.
### CODES LPs that currently have event type collection implemented:
If you're using any of the following CODES models, you don't have to add anything:
- slimfly router and terminal LPs (slimfly.c)
- fat tree switch and terminal LPs (fat-tree.c)
- model-net-base-lp (model-net-lp.c)
...
# Process this file with autoconf to produce a configure script.
AC_PREREQ([2.67])
AC_INIT([codes], [1.1.0], [http://trac.mcs.anl.gov/projects/codes/newticket],[],[http://www.mcs.anl.gov/projects/codes/])
LT_INIT
......
NOTE: see bottom of this file for suggested configurations on particular ANL
machines.
0 - Checkout, build, and install the trunk version of ROSS
(https://github.com/ross-org/ROSS). At the time of
release (0.6.0), ROSS's latest commit hash was 10d7a06b2d, so this revision is
"safe" in the unlikely case incompatible changes come along in the future. If
working from the CODES master branches, use the ROSS master branch.
    git clone http://github.com/ross-org/ROSS.git
    # if using 0.5.2 release: git checkout d3bdc07
    cd ROSS
    mkdir build
    ...
    ROSS/install/ directory>
For more details on installing ROSS, go to
https://github.com/ross-org/ROSS/blob/master/README.md .
If using ccmake to configure, don't forget to set CMAKE_C_COMPILER and
CMAKE_CXX_COMPILER to mpicc/mpicxx
......
...
== CODES
* main site: http://www.mcs.anl.gov/projects/codes/
* repositories:
  * "base" (this repository): git.mcs.anl.gov:radix/codes-base
  * codes-net (networking component of CODES): git.mcs.anl.gov:radix/codes-net
...
== ROSS
* main site, repository, etc.: https://github.com/ross-org/ROSS
* both the site and repository contain good documentation as well - refer to
  it for an in-depth introduction and overview of ROSS proper
= Components of CODES
== Configuration
... structured configuration file. The configuration format allows categories, and
optionally subgroups within the category, of key-value pairs for configuration.
The LPGROUPS category defines the LP configuration. The PARAMS category is
currently used for networking and ROSS-specific parameters. User-defined
categories can also be used.
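As a bare-bones illustration of this layout (the group name, LP counts, and parameter values below are placeholders drawn from the fuller examples later in this document):
    LPGROUPS
    {
       SERVERS
       {
          repetitions="16";
          nw-lp="1";
          modelnet_simplenet="1";
       }
    }
    PARAMS
    {
       packet_size="512";
       modelnet_order=( "simplenet" );
    }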
The configuration system additionally allows LP specialization via the usage of
"annotations". This allows two otherwise identical LPs to have different
The format of the metadata file is a set of lines containing:
<group ID> <start ID> <end ID inclusive> <kernel file>
where:
* <group ID> is the ID of this group (see restrictions)
* <start ID> and <end ID> form the range of logical client IDs that will
  perform the given workload. Note that the end ID is inclusive, so a start,
  end pair of 0, 3 will include IDs 0, 1, 2, and 3. An <end ID> of -1 indicates
  to use the remaining number of clients as specified by the user.
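For instance, a metadata file splitting 16 clients across two workloads could read as follows (the kernel file names are hypothetical):
    0 0 7 app1-workload.conf
    1 8 -1 app2-workload.conf
Here group 0 runs clients 0 through 7, and group 1 takes the remaining clients 8 through 15.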
... should be everything up to the rank number. E.g., if the dumpi files are of the
form "dumpi-YYYY.MM.DD.HH.MM.SS-XXXX.bin", then the input should be
"dumpi-YYYY.MM.DD.HH.MM.SS-"
=== Quality of Service
Two models (dragonfly-dally.C and dragonfly-plus.C) can now support traffic
differentiation and prioritization. The models support quality of service by
directing network traffic onto separate classes of virtual channels. Additional
documentation on using traffic classes can be found at the wiki link:
https://xgitlab.cels.anl.gov/codes/codes/wikis/Quality-of-Service
=== Workload generator helpers
The codes-jobmap API (codes/codes-jobmap.h) specifies mechanisms to initialize
lsm
{
   ...
}
The API can be found at codes/local-storage-model.h and example usage can be
seen in tests/local-storage-model-test.c and tests/conf/lsm-test.conf.
The queueing policy of LSM is currently FIFO, and the default mode uses an
implicit queue, simply incrementing counters and scheduling future events when
model-net LPs, like any other LP in CODES/ROSS, require specification in the
configuration file as well as LP-specific parameters. To identify model-net LPs
to CODES, we use the naming scheme of prefixing model-net LP names with the
string "modelnet_". For example, the "simplenet" LP (see "model-net models")
would be specified in the LPGROUPS section as "modelnet_simplenet".
Currently, model-net LPs expect their configuration values in the PARAMS
section. This is due to historical precedent and may be changed in the future
...
= model-net models
Currently, model-net contains a combination of analytical models and specific
models for high-performance networking, with an HPC bent.
Configuration files for each model can be found in tests/conf, under the name
"model-net-test*.conf".
... infinite packet buffering when routing. These are best used for models that
require little fidelity out of the network performance. SimpleP2P is the same
model as Simplenet, except it provides heterogeneous link capacities from point
to point. Rather than a single entry, it requires files containing a matrix of
point-to-point bandwidths and latencies.
Simplenet models require two configuration parameters: "net_startup_ns" and
"net_bw_mbps", which define the startup and bandwidth costs in nanoseconds and
The only configuration entry the LogGP model requires is ... configuration file.
For more details on gathering parameters for the LogGP model, as well as its
usage and caveats, see the document src/model-net/doc/README.loggp.txt.
== Torus
The configuration parameters are a little trickier here, as additional LPs
other than the "modelnet_dragonfly" LP must be specified. "modelnet_dragonfly"
represents the node LPs (terminals), while a second type, "dragonfly_router",
represents a physical router. At least one "dragonfly_router" LP must be
present in every LP group with a "modelnet_dragonfly" LP.
Further configuration and model setup can be found at
src/model-net/doc/README.dragonfly.txt.
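For illustration, an LP group that satisfies this rule pairs the two LP types in every repetition (the counts below are arbitrary; the dragonfly README excerpt later in this document shows a complete example):
    MODELNET_GRP
    {
       repetitions="264";
       nw-lp="4";
       modelnet_dragonfly="4";
       dragonfly_router="1";
    }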
...
== Server state and event handlers
The server LP state maintains a count of the number of remote messages it has
sent and received as well as the number of local completion messages.
For the server event message, we have four message types: KICKOFF, REQ, ACK and
LOCAL. With a KICKOFF event, each LP sends a message to itself to begin the
simulation proper. To avoid event ties, we add a small amount of random noise
using codes_local_latency. The REQ message is sent by a server to its
neighboring server and, when received, the neighboring server sends back a message
of type ACK. We've shown a hard-coded direct communication method which
directly computes the LP ID, and a codes-mapping API-based method.
== Server reverse computation
... conservative modes, so reverse computation may not be necessary if the
simulation is not compute- or memory-intensive.
For our example program, recall the "forward" event handlers. They perform the
following:
* Kickoff: send a message to the peer server, and increment the sender LP's
  count of sent messages.
* Request (received from peer server): increment the receiver's count of
  received messages, and send an acknowledgement to the sender.
* Acknowledgement (received from message receiver): send the next
  message to the receiver and increment the messages sent count. Set a flag
  indicating whether a message has been sent.
* Local model-net callback: increment the local model-net
  received messages count.
In terms of LP state, the four operations simply modify counts. Hence,
the "reverse" event handlers merely need to roll back those changes, as shown
in the sketch after this list:
* Kickoff: decrement the sender LP's count of sent messages.
* Request (received from peer server): decrement the receiver's count of
  received messages.
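A minimal sketch of those reverse handlers in C, using illustrative type and field names rather than the example program's real ones:

    /* message types named in the text above */
    enum svr_event_type { KICKOFF, REQ, ACK, LOCAL };

    typedef struct {
        int msg_sent_count;    /* remote messages sent by this server */
        int msg_recvd_count;   /* remote messages received from the peer */
        int local_recvd_count; /* local model-net completion callbacks */
    } svr_state;

    typedef struct {
        enum svr_event_type type;
    } svr_msg;

    /* reverse handler: undo exactly the state changes of the forward handler */
    static void svr_rev_event(svr_state *ns, svr_msg *m)
    {
        switch (m->type) {
        case KICKOFF: ns->msg_sent_count--;    break; /* undo sent-count increment */
        case REQ:     ns->msg_recvd_count--;   break; /* undo received-count increment */
        case ACK:     ns->msg_sent_count--;    break; /* undo next-message send count */
        case LOCAL:   ns->local_recvd_count--; break; /* undo local callback count */
        }
        /* any RNG draws or network events issued in the forward handler would
         * need matching reverse calls here as well */
    }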
......
...
Adding a generic template for building new network models. For the simplest case,
only 2 functions and preamble changes should suffice to add a new network.
Updated Express Mesh network model to serve as an example. For details, see
Darshan workload generator has been updated to use Darshan version 3.x. Darshan workload generator has been updated to use Darshan version 3.x.
...
Compatible with the ROSS version that enables statistics collection of simulation
performance. For details see:
http://ross-org.github.io/instrumentation/instrumentation.html
Online workload replay functionality has been added that allows SWM workloads
to be simulated in situ on the network models. Work is in progress to integrate the
Conceptual domain-specific language for network communication.
Multiple traffic patterns were added in the background traffic generation,
including stencil, all-to-all and random permutation.
Background network communication using a uniform random workload can now be
generated. The traffic generation gets automatically shut off when the main workload
finishes.
Collectives can now be translated into point-to-point operations using the CoRTex library.
Performance of MPI_AllReduce is reported when the debug_cols option is enabled.
......
...
% IEEEtran.cls handling of captions and this will result in non-IEEE style
% figure/table captions. To prevent this problem, be sure and preload
% caption.sty with its "caption=false" package option. This will preserve
% IEEEtran.cls handling of captions. Version 1.3 (2005/06/28) and later
% (recommended due to many improvements over 1.2) of subfig.sty supports
% the caption=false option directly:
%\usepackage[caption=false,font=footnotesize]{subfig}
... easily shared and reused. It also includes a few tips to help avoid common
simulation bugs.
For more information, ROSS has a bunch of documentation available in its
repository/wiki - see \url{https://github.com/ross-org/ROSS}.
\end{abstract}
\section{CODES: modularizing models}
... action upon the completion of them. More generally, the problem is: an event
issuance (an ack to the client) is based on the completion of more than one
asynchronous/parallel event (local write on primary server, forwarding write to
replica server). Further complicating the matter for storage simulations, there
can be any number of outstanding requests, each waiting on multiple events.
In ROSS's sequential and conservative parallel modes, the necessary state can
easily be stored in the LP as a queue of statuses for each set of events,
Most core ROSS examples are designed to intentionally hit
the end timestamp for the simulation (i.e. they are modeling a continuous,
steady state system). This isn't necessarily true for other models. Quite
simply, set g\_tw\_ts\_end to an arbitrarily large number when running simulations
that have a well-defined end-point in terms of events processed.
Within the LP finalize function, do not call tw\_now. The time returned may not
be consistent in the case of an optimistic simulation.
...
\item generating multiple concurrent events makes rollback more difficult
\end{enumerate}
\item use dummy events to work around "event-less" advancement of simulation time
\item add a small amount of time "noise" to events to prevent ties
......
Notes on how to release a new version of CODES
...
4. Upload the release tarball
- Our release directory is at ftp.mcs.anl.gov/pub/CODES/releases . There's no
  web interface, so you have to get onto an MCS workstation and copy the
  release in that way (the ftp server is mounted at /mcs/ftp.mcs.anl.gov).
5. Update website
- Project wordpress: http://www.mcs.anl.gov/projects/codes/ (you need
......
int main(
...
    /* calculate the number of servers in this simulation,
     * ignoring annotations */
    num_servers = codes_mapping_get_lp_count(group_name, 0, "nw-lp", NULL, 1);
    /* for this example, we read from a separate configuration group for
     * server message parameters. Since they are constant for all LPs, ... */
...
static void svr_add_lp_type()
{
    /* lp_type_register should be called exactly once per process per
     * LP type */
    lp_type_register("nw-lp", svr_get_lp_type());
}
static void svr_init(
......
...
# of application- and codes-specific key-value pairs.
LPGROUPS
{
    # in our simulation, we simply have a set of servers (nw-lp), each with
    # point-to-point access to each other
    SERVERS
    {
        # required: number of times to repeat the following key-value pairs
        repetitions="16";
        # application-specific: parsed in main
        nw-lp="1";
        # model-net-specific field defining the network backend. In this example,
        # each server has one NIC, and the servers are point-to-point connected
        modelnet_simplenet="1";
......
argobots_libs=@ARGOBOTS_LIBS@
argobots_cflags=@ARGOBOTS_CFLAGS@
swm_libs=@SWM_LIBS@
swm_cflags=@SWM_CFLAGS@
swm_datarootdir=@SWM_DATAROOTDIR@
Name: codes-base
Description: Base functionality for CODES storage simulation
Version: @PACKAGE_VERSION@
URL: http://trac.mcs.anl.gov/projects/CODES
Requires:
Libs: -L${libdir} -lcodes ${ross_libs} ${argobots_libs} ${swm_libs} ${darshan_libs} ${dumpi_libs} ${cortex_libs}
Cflags: -I${includedir} -I${swm_datarootdir} ${ross_cflags} ${darshan_cflags} ${swm_cflags} ${argobots_cflags} ${dumpi_cflags} ${cortex_cflags}
...
# In hindsight this was a lot more complicated than I intended. It was looking to solve a complex problem that turned out to be invalid from the beginning.
### USAGE ###
# Correct usage: python3 dragonfly-plus-topo-gen-v2.py <router_radix> <num_gc_between_groups> <intra-file> <inter-file>
### ###
import sys
...
def mainV3():
    ...
    print(A.astype(int))

# def mainV2():
#     if(len(argv) < 8):
#         raise Exception("Correct usage: python %s <num_groups> <num_spine_pg> <num_leaf_pg> <router_radix> <terminals-per-leaf> <intra-file> <inter-file>" % sys.argv[0])
#     num_groups = int(argv[1])
#     num_spine_pg = int(argv[2])
#     num_leaf_pg = int(argv[3])
#     router_radix = int(argv[4])
#     term_per_leaf = int(argv[5])
#     intra_filename = argv[6]
#     inter_filename = argv[7]
#     parseOptionArguments()
#     dfp_network = DragonflyPlusNetwork(num_groups, num_spine_pg, num_leaf_pg, router_radix, num_hosts_per_leaf=term_per_leaf)
#     if not DRYRUN:
#         dfp_network.writeIntraconnectionFile(intra_filename)
#         dfp_network.writeInterconnectionFile(inter_filename)
#     if LOUDNESS is not Loudness.QUIET:
#         print("\nNOTE: THIS STILL CAN'T DO THE MED-LARGE TOPOLOGY RIGHT\n")
#         print(dfp_network.getSummary())
#     if SHOW_ADJACENCY == 1:
#         print("\nPrinting Adjacency Matrix:")
#         np.set_printoptions(linewidth=400,threshold=10000,edgeitems=200)
#         A = dfp_network.getAdjacencyMatrix(AdjacencyType.ALL_CONNS)
#         print(A.astype(int))

if __name__ == '__main__':
    mainV3()
...
# of application- and codes-specific key-value pairs.
LPGROUPS
{
    # in our simulation, we simply have a set of servers (nw-lp), each with
    # point-to-point access to each other
    SERVERS
    {
        # required: number of times to repeat the following key-value pairs
        repetitions="C_NUM_SERVERS";
        # application-specific: parsed in main
        nw-lp="1";
        # model-net-specific field defining the network backend. In this example,
        # each server has one NIC, and the servers are point-to-point connected
        modelnet_simplenet="1";
......
PARAMS
...
    # bandwidth in GiB/s for compute node-router channels
    cn_bandwidth="16.0";
    # ROSS message size
    message_size="736";
    # number of compute nodes connected to router, dictated by dragonfly config
    # file
    num_cns_per_router="2";
......
LPGROUPS
MODELNET_GRP
{
    repetitions="198";
    nw-lp="384";
    modelnet_fattree="24";
    fattree_switch="6";
}
......
LPGROUPS
MODELNET_GRP
{
    repetitions="252";
    nw-lp="288";
    modelnet_fattree="18";
    fattree_switch="6";
}
......
PARAMS
...
    local_bandwidth="5.25";
    global_bandwidth="4.7";
    cn_bandwidth="5.25";
    message_size="736";
    routing="adaptive";
}
PARAMS
{
    ft_type="0";
    packet_size="512";
    message_size="736";
    chunk_size="512";
    modelnet_scheduler="fcfs";
    #modelnet_scheduler="round-robin";
......
PARAMS
...
    cn_bandwidth="9.0";
    router_delay="0";
    link_delay="0";
    message_size="736";
    routing="minimal";
}
LPGROUPS
...
PARAMS
{
    packet_size="512";
    message_size="736";
    modelnet_order=( "torus" );
    # scheduler options
    modelnet_scheduler="fcfs";
......
LPGROUPS
MODELNET_GRP
{
    repetitions="264";
    nw-lp="4";
    modelnet_dragonfly="4";
    modelnet_dragonfly_router="1";
}
......
LPGROUPS
MODELNET_GRP
{
    repetitions="198"; # repetitions = Ne = total # of edge switches. For type0 Ne = Np*Ns = ceil(N/Ns*(k/2))*(k/2) = ceil(N/(k/2)^2)*(k/2)
    nw-lp="18";
    modelnet_fattree="18";
    fattree_switch="3";
}
......
LPGROUPS
MODELNET_GRP
{
    repetitions="32"; # repetitions = Ne = total # of edge switches. For type0 Ne = Np*Ns = ceil(N/Ns*(k/2))*(k/2) = ceil(N/(k/2)^2)*(k/2)
    nw-lp="4";
    modelnet_fattree="4";
    fattree_switch="3";
}
......
LPGROUPS
MODELNET_GRP
{
    repetitions="50";
    nw-lp="3";
    modelnet_slimfly="3";
    slimfly_router="1";
}
......
static const st_model_types *ft_svr_get_model_stat_types(void)
...
void ft_svr_register_model_stats()
{
    st_model_type_register("nw-lp", ft_svr_get_model_stat_types());
}
const tw_optdef app_opt [] =
...
const tw_lptype* svr_get_lp_type()
...
static void svr_add_lp_type()
{
    lp_type_register("nw-lp", svr_get_lp_type());
}
static void issue_event(
...
int main(
...
        MPI_Finalize();
        return 0;
    }
    num_servers_per_rep = codes_mapping_get_lp_count("MODELNET_GRP", 1, "nw-lp",
            NULL, 1);
    configuration_get_value_int(&config, "PARAMS", "num_routers", NULL, &num_routers_per_grp);
...
    num_nodes = num_groups * num_routers_per_grp * (num_routers_per_grp / 2);
    num_nodes_per_grp = num_routers_per_grp * (num_routers_per_grp / 2);
    num_nodes = codes_mapping_get_lp_count("MODELNET_GRP", 0, "nw-lp", NULL, 1);
    printf("num_nodes:%d \n",num_nodes);
......
static const st_model_types *svr_get_model_stat_types(void)
...
void svr_register_model_types()
{
    st_model_type_register("nw-lp", svr_get_model_stat_types());
}
const tw_optdef app_opt [] =
...
const tw_lptype* svr_get_lp_type()
...
static void svr_add_lp_type()
{
    lp_type_register("nw-lp", svr_get_lp_type());
}
/* convert GiB/s and bytes to ns */
...
int main(
...
    net_id = *net_ids;
    free(net_ids);
    num_servers_per_rep = codes_mapping_get_lp_count("MODELNET_GRP", 1, "nw-lp", NULL, 1);
    configuration_get_value_int(&config, "PARAMS", "num_terminals", NULL, &num_terminals);
    configuration_get_value_int(&config, "PARAMS", "num_routers", NULL, &num_routers_per_grp);
    num_groups = (num_routers_per_grp * 2);
......
static const st_model_types *svr_get_model_stat_types(void)
...
void svr_register_model_types()
{
    st_model_type_register("nw-lp", svr_get_model_stat_types());
}
const tw_optdef app_opt [] =
...
const tw_lptype* svr_get_lp_type()
...
static void svr_add_lp_type()
{
    lp_type_register("nw-lp", svr_get_lp_type());
}
static void issue_event(
...
int main(
...
        MPI_Finalize();
        return 0;
    }
    num_servers_per_rep = codes_mapping_get_lp_count("MODELNET_GRP", 1, "nw-lp",
            NULL, 1);
    configuration_get_value_int(&config, "PARAMS", "num_routers", NULL, &num_routers_per_grp);
......
@@ -58,7 +58,7 @@ p*a*g=8*4*33). This configuration can be specified in the config file in the fol
 MODELNET_GRP
 {
    repetitions="264";
-   server="4";
+   nw-lp="4";
    modelnet_dragonfly="4";
    dragonfly_router="1";
 }
@@ -70,7 +70,7 @@ PARAMS
 }
 The first section, MODELNET_GRP specified the number of LPs and the layout of
-LPs. In the above case, there are 264 repetitions of 4 server LPs, 4 dragonfly
+LPs. In the above case, there are 264 repetitions of 4 server LPs (nw-lp), 4 dragonfly
 network node LPs and 1 dragonfly router LP, which makes a total of 264 routers,
 1056 network nodes and 1056 servers in the network. The second section, PARAMS
 uses 'num_routers' for the dragonfly topology lay out and setsup the
...
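The totals quoted above (264 routers, 1056 network nodes, 1056 nw-lp servers) can be reproduced with a few lines of arithmetic, assuming the standard 1-D dragonfly construction in which each router hosts a/2 terminals and a/2 global channels and the group count is a*(a/2) + 1. This stand-alone check is illustrative only and not part of the repository:

    #include <assert.h>
    #include <stdio.h>

    int main(void)
    {
        int num_routers_per_grp = 8;  /* 'num_routers' in the PARAMS section */
        int num_groups  = num_routers_per_grp * (num_routers_per_grp / 2) + 1; /* 33 */
        int num_routers = num_groups * num_routers_per_grp;                    /* 264 = repetitions */
        int num_nodes   = num_routers * (num_routers_per_grp / 2);             /* 1056 network nodes */

        /* one nw-lp server per network node in this configuration */
        assert(num_routers == 264 && num_nodes == 1056);
        printf("groups=%d routers=%d nodes=%d servers=%d\n",
               num_groups, num_routers, num_nodes, num_nodes);
        return 0;
    }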
@@ -8,7 +8,7 @@ bare-minimum example config file:
 MODELNET_GRP
 {
    repetitions="12";
-   server="4";
+   nw-lp="4";
    modelnet_fattree="4";
    fattree_switch="3";
 }
@@ -24,7 +24,7 @@ PARAMS
 The first section, MODELNET_GRP specifies the LP types, number of LPs per type
 and their configuration. In the above case, there are 12 repetitions each with 4
-server LPs, 4 fat tree network node/terminal LPs and 3 fat tree switch LPs. Each
+server LPs (nw-lp), 4 fat tree network node/terminal LPs and 3 fat tree switch LPs. Each
 repetition represents a leaf level switch, nodes connected to it, and higher
 level switches that may be needed to construct the fat-tree. The
 'fattree_switch' parameter indicates there are 3 levels to this fat tree and
...
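For the fat tree example the totals are direct per-repetition multiples: 12 x 4 = 48 nw-lp servers, 12 x 4 = 48 terminals and 12 x 3 = 36 switch LPs. A hedged sketch of how a model program can recover these counts at runtime through the mapping API used in the patched test programs above; print_fattree_counts() is an illustrative helper, not repository code:

    #include <stdio.h>
    #include "codes/codes_mapping.h"

    void print_fattree_counts(void)
    {
        /* second argument: 1 = count within one repetition, 0 = count across all */
        int servers_per_rep = codes_mapping_get_lp_count("MODELNET_GRP", 1, "nw-lp", NULL, 1);
        int total_servers   = codes_mapping_get_lp_count("MODELNET_GRP", 0, "nw-lp", NULL, 1);
        int total_terminals = codes_mapping_get_lp_count("MODELNET_GRP", 0, "modelnet_fattree", NULL, 1);
        int total_switches  = codes_mapping_get_lp_count("MODELNET_GRP", 0, "fattree_switch", NULL, 1);

        /* with the config above this should print 4 48 48 36 */
        printf("%d %d %d %d\n", servers_per_rep, total_servers, total_terminals, total_switches);
    }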
@@ -73,7 +73,7 @@ generator sets. This configuration can be specified in the config file in the fo
 MODELNET_GRP
 {
    repetitions="50";
-   server="3";
+   nw-lp="3";
    modelnet_slimfly="3";
    slimfly_router="1";
 }
@@ -88,7 +88,7 @@ PARAMS
 }
 The first section, MODELNET_GRP specified the number of LPs and the layout of
-LPs. In the above case, there are 50 repetitions of 3 server LPs, 3 slim fly
+LPs. In the above case, there are 50 repetitions of 3 server LPs (nw-lp), 3 slim fly
 network node LPs and 1 slim fly router LP, which makes a total of 50 routers,
 150 network nodes and 150 servers in the network. The second section, PARAMS
 uses 'num_routers' for the slim fly topology lay out and sets up the
...
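The slim fly counts follow the same pattern. As a sketch, assuming the usual MMS construction in which a slim fly built from a prime power q has 2*q*q routers (q = 5 here) and each router hosts the 3 terminals given in the config; this check is illustrative only:

    #include <assert.h>
    #include <stdio.h>

    int main(void)
    {
        int q = 5;                       /* assumed MMS parameter for this layout */
        int num_routers   = 2 * q * q;   /* 50 = repetitions */
        int nodes_per_rtr = 3;           /* nw-lp="3", modelnet_slimfly="3" */
        int num_nodes     = num_routers * nodes_per_rtr;   /* 150 nodes and servers */

        assert(num_routers == 50 && num_nodes == 150);
        printf("routers=%d nodes=%d servers=%d\n", num_routers, num_nodes, num_nodes);
        return 0;
    }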
@@ -415,6 +415,8 @@ struct router_state
     tw_stime* busy_time;
     tw_stime* busy_time_sample;
+    unsigned long* stalled_chunks; //Counter for when a packet is put into queued messages instead of routing
     terminal_dally_message_list ***pending_msgs;
     terminal_dally_message_list ***pending_msgs_tail;
     terminal_dally_message_list ***queued_msgs;
@@ -1538,7 +1540,9 @@ void router_dally_setup(router_state * r, tw_lp * lp)
     r->link_traffic_sample = (int64_t*)calloc(p->radix, sizeof(int64_t));
     r->cur_hist_num = (int*)calloc(p->radix, sizeof(int));
     r->prev_hist_num = (int*)calloc(p->radix, sizeof(int));
+    r->stalled_chunks = (unsigned long*)calloc(p->radix, sizeof(unsigned long));
     r->last_sent_chan = (int*) calloc(p->num_router_rows, sizeof(int));
     r->vc_occupancy = (int**)calloc(p->radix , sizeof(int*));
     r->in_send_loop = (int*)calloc(p->radix, sizeof(int));
@@ -1781,9 +1785,14 @@ static void packet_generate_rc(terminal_state * s, tw_bf * bf, terminal_dally_me
     if(bf->c5) {
         s->in_send_loop = 0;
     }
-    if(bf->c11) {
+    if (bf->c11) {
         s->issueIdle = 0;
     }
+    if(bf->c8) {
+        s->last_buf_full = msg->saved_busy_time;
+    }
     struct mn_stats* stat;
     stat = model_net_find_stats(msg->category, s->dragonfly_stats_array);
     stat->send_count--;
@@ -1915,6 +1924,15 @@ static void packet_generate(terminal_state * s, tw_bf * bf, terminal_dally_messa
     } else {
         bf->c11 = 1;
         s->issueIdle = 1;
+        //this block was missing from when QOS was added - readded 5-21-19
+        if(s->last_buf_full == 0.0)
+        {
+            bf->c8 = 1;
+            msg->saved_busy_time = s->last_buf_full;
+            /* TODO: Assumes a single vc from terminal to router */
+            s->last_buf_full = tw_now(lp);
+        }
     }
     if(s->in_send_loop == 0) {
@@ -2179,8 +2197,7 @@ static void packet_send(terminal_state * s, tw_bf * bf, terminal_dally_message *
     msg->num_rngs = 0;
     msg->num_cll = 0;
-    if(num_qos_levels > 1)
-        vcg = get_next_vcg(s, bf, msg, lp);
+    vcg = get_next_vcg(s, bf, msg, lp);
     /* For a terminal to router connection, there would be as many VCGs as number
      * of VCs*/
@@ -2312,6 +2329,7 @@ static void packet_send(terminal_state * s, tw_bf * bf, terminal_dally_message *
         s->busy_time += (tw_now(lp) - s->last_buf_full);
         s->busy_time_sample += (tw_now(lp) - s->last_buf_full);
         s->ross_sample.busy_time_sample += (tw_now(lp) - s->last_buf_full);
+        msg->saved_busy_time_ross = s->busy_time_ross_sample;
         s->busy_time_ross_sample += (tw_now(lp) - s->last_buf_full);
         s->last_buf_full = 0.0;
     }
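The two terminal-side hunks above restore busy-time bookkeeping that was dropped when QoS support was added: packet_generate stamps last_buf_full the first time the send buffer fills (guarded by bitfield c8 so the reverse handler can undo it), and packet_send later folds tw_now(lp) - last_buf_full into the busy-time counters and clears the stamp. A compressed sketch of that forward/reverse pairing, with simplified stand-ins for the model's state, bitfield and message types (not the model's actual code):

    #include <assert.h>

    typedef struct { double last_buf_full, busy_time; } term_state;
    typedef struct { unsigned c8 : 1; } bitfield;
    typedef struct { double saved_busy_time; } term_msg;

    static double sim_now;   /* stands in for tw_now(lp) */

    /* forward: buffer just became full -> record the time once */
    static void on_buffer_full(term_state *s, bitfield *bf, term_msg *m)
    {
        if (s->last_buf_full == 0.0) {
            bf->c8 = 1;                      /* remember that this event set it */
            m->saved_busy_time = s->last_buf_full;
            s->last_buf_full = sim_now;
        }
    }

    /* reverse: undo exactly what the forward handler did */
    static void on_buffer_full_rc(term_state *s, bitfield *bf, term_msg *m)
    {
        if (bf->c8)
            s->last_buf_full = m->saved_busy_time;
    }

    /* forward: sending resumes -> account the stalled interval and reset */
    static void on_send_resumed(term_state *s)
    {
        if (s->last_buf_full > 0.0) {
            s->busy_time += sim_now - s->last_buf_full;
            s->last_buf_full = 0.0;
        }
    }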
@@ -3136,11 +3154,11 @@ dragonfly_dally_terminal_final( terminal_state * s,
     if(s->terminal_id == 0)
     {
-        written += sprintf(s->output_buf + written, "# Format <source_id> <source_type> <dest_id> < dest_type> <link_type> <link_traffic> <link_saturation>");
+        written += sprintf(s->output_buf + written, "# Format <source_id> <source_type> <dest_id> < dest_type> <link_type> <link_traffic> <link_saturation> <stalled_chunks>");
         // fprintf(fp, "# Format <LP id> <Terminal ID> <Total Data Size> <Avg packet latency> <# Flits/Packets finished> <Avg hops> <Busy Time> <Max packet Latency> <Min packet Latency >\n");
     }
-    written += sprintf(s->output_buf + written, "\n%u %s %llu %s %s %llu %lf",
-        s->terminal_id, "T", s->router_id, "R", "CN", LLU(s->total_msg_size), s->busy_time);
+    written += sprintf(s->output_buf + written, "\n%u %s %llu %s %s %llu %lf %d",
+        s->terminal_id, "T", s->router_id, "R", "CN", LLU(s->total_msg_size), s->busy_time, -1); //note that terminals don't have stalled chuncks because of model net scheduling only gives a terminal what it can handle (-1 to show N/A)
     lp_io_write(lp->gid, (char*)"dragonfly-link-stats", written, s->output_buf);
@@ -3211,36 +3229,40 @@ void dragonfly_dally_router_final(router_state * s,
         if(d != src_rel_id)
         {
             int dest_ab_id = local_grp_id * p->num_routers + d;
-            written += sprintf(s->output_buf + written, "\n%d %s %d %s %s %llu %lf",
+            written += sprintf(s->output_buf + written, "\n%d %s %d %s %s %llu %lf %lu",
                 s->router_id,
                 "R",
                 dest_ab_id,
                 "R",
                 "L",
                 s->link_traffic[d],
-                s->busy_time[d]);
+                s->busy_time[d],
+                s->stalled_chunks[d]);
         }
     }
     map< int, vector<bLink> > &curMap = interGroupLinks[s->router_id];
     map< int, vector<bLink> >::iterator it = curMap.begin();
     for(; it != curMap.end(); it++)
     {
         /* TODO: Works only for single global connections right now. Make it functional
          * for a 2-D dragonfly. */
         for(int l = 0; l < it->second.size(); l++) {
             int dest_rtr_id = it->second[l].dest;
             int offset = it->second[l].offset;
             assert(offset >= 0 && offset < p->num_global_channels);
-            written += sprintf(s->output_buf + written, "\n%d %s %d %s %s %llu %lf",
+            written += sprintf(s->output_buf + written, "\n%d %s %d %s %s %llu %lf %lu",
                 s->router_id,
                 "R",
                 dest_rtr_id,
                 "R",
                 "G",
                 s->link_traffic[offset],
-                s->busy_time[offset]);
+                s->busy_time[offset],
+                s->stalled_chunks[offset]);
         }
     }
     sprintf(s->output_buf + written, "\n");
     lp_io_write(lp->gid, (char*)"dragonfly-link-stats", written, s->output_buf);
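With the extra column, every row of the dragonfly-link-stats output now ends with the stalled-chunk count for that link; terminal (CN) rows report -1 because, as the in-line comment notes, model-net scheduling only hands a terminal what it can accept. A purely illustrative example of the resulting row layout, reusing the patched format strings with made-up values:

    #include <stdio.h>

    int main(void)
    {
        printf("# Format <source_id> <source_type> <dest_id> < dest_type> <link_type>"
               " <link_traffic> <link_saturation> <stalled_chunks>");
        /* one terminal-to-router (CN) row and one router-to-router local (L) row */
        printf("\n%u %s %llu %s %s %llu %lf %d", 0u, "T", 4ULL, "R", "CN",
               1048576ULL, 0.0, -1);
        printf("\n%d %s %d %s %s %llu %lf %lu\n", 4, "R", 5, "R", "L",
               2097152ULL, 12.5, 7UL);
        return 0;
    }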
@@ -4012,6 +4034,11 @@ static void router_packet_receive_rc(router_state * s,
         }
     }
     if(bf->c4) {
+        s->stalled_chunks[output_port]--;
+        if(bf->c22)
+        {
+            s->last_buf_full[output_port] = msg->saved_busy_time;
+        }
         delete_terminal_dally_message_list(return_tail(s->queued_msgs[output_port],
             s->queued_msgs_tail[output_port], output_chan));
         s->queued_count[output_port] -= s->params->chunk_size;
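The receive-side hunks pair every forward mutation with its exact reverse, as ROSS optimistic simulation requires: when a chunk cannot be routed and is appended to queued_msgs, the forward handler (below) increments stalled_chunks[output_port] and may stamp last_buf_full under bitfield c22, while router_packet_receive_rc (above) decrements the counter and restores the stamp. A minimal sketch of the queue-or-route decision being counted, with simplified types that merely stand in for the router state:

    /* illustrative only: the decision the stalled_chunks counter instruments */
    typedef struct { int vc_space; unsigned long stalled_chunks; } port_state;

    static int route_or_queue(port_state *p, int chunk_size)
    {
        if (p->vc_space >= chunk_size) {   /* output VC can take the chunk: route it */
            p->vc_space -= chunk_size;
            return 1;
        }
        p->stalled_chunks++;               /* otherwise it goes to queued_msgs */
        return 0;
    }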
@@ -4282,12 +4309,27 @@ if(cur_chunk->msg.path_type == NON_MINIMAL)
     } else {
         bf->c4 = 1;
+        s->stalled_chunks[output_port]++;
         cur_chunk->msg.saved_vc = msg->vc_index;
         cur_chunk->msg.saved_channel = msg->output_chan;
         assert(output_chan < s->params->num_vcs && output_port < s->params->radix);
         append_to_terminal_dally_message_list( s->queued_msgs[output_port],
             s->queued_msgs_tail[output_port], output_chan, cur_chunk);
         s->queued_count[output_port] += s->params->chunk_size;
+        //THIS WAS REMOVED WHEN QOS WAS INSTITUTED - READDED 5/20/19
+        /* a check for pending msgs is non-empty then we dont set anything. If
+         * that is empty then we check if last_buf_full is set or not. If already
+         * set then we don't overwrite it. If two packets arrive next to each other
+         * then the first person should be setting it. */
+        if(s->pending_msgs[output_port][output_chan] == NULL && s->last_buf_full[output_port] == 0.0)
+        {
+            bf->c22 = 1;
+            msg->saved_busy_time = s->last_buf_full[output_port];
+            s->last_buf_full[output_port] = tw_now(lp);