Commit 4a4dafb7 authored by Jonathan Jenkins's avatar Jonathan Jenkins

Mostly complete GETTING_STARTED guide.

Perhaps this should just be renamed USER_GUIDE?
parent f41cd6e4
This is an outline for the getting started document in a roughly asciidoc
format.
= Resources/links
ROSS site + repo
CODES site + repo + trac
mailing lists
= CODES/ROSS resources
CODES and ROSS share a mailing list. It is at:
https://lists.mcs.anl.gov/mailman/listinfo/codes-ross-users
== CODES
* main site: http://www.mcs.anl.gov/projects/codes/
* repositories:
* "base" (this repository): git.mcs.anl.gov:radix/codes-base
* codes-net (networking support): git.mcs.anl.gov:radix/codes-net
* bug tracking: https://trac.mcs.anl.gov/projects/CODES
== ROSS
* main site, repository, etc.: https://github.com/carothersc/ROSS
= Components of CODES
== configuration
The configuration of LPs, LP parameterization, and miscellaneous simulation
parameters are specified by the CODES configuration system, which uses a
structured configuration file. The configuration format allows categories, and
optionally subgroups within the category, of key-value pairs for configuration.
The LPGROUPS category defines the LP configuration. The PARAMS category is
currently used for networking and ROSS-specific parameters. User-defined
categories can also be used.
The configuration system additionally allows LP specialization via the usage of
"annotations". This allows two otherwise identical LPs to have different
parameterizations. Annotations have a simple "@" syntax appended to the LP
fields, and are optional.
The API is located at codes/configuration.h, which provides various types of
access into the simulation configuration. Detailed configuration files can be
found at doc/example/example.conf and doc/example_heterogeneous/example.conf.
== LP mapping
The codes-mapping API maps user LPs to global LP IDs, providing numerous
options for modulating the namespace under which the mapping is conducted.
Mapping is performed on a per-group or per-LP-type basis, with numerous further
filtering options including on an LPs annotation. Finally, the mapping API
provides LP counts using the aforementioned filtering options.
The API can be found at codes/codes_mapping.h. doc/example/example.c shows a
simple example of the mapping functionality, while the test program
tests/mapping_test.c with configuration file tests/conf/mapping_test.conf
exhaustively demonstrate the mapping API.
== workload generator(s)
codes-workload is an in-development abstraction layer for feeding I/O / network
workloads into a simulation. It supports multiple back-ends for generating I/O
and network events; data could come from a trace file, from Darshan, or from a
synthetic description.
The workload generator is currently a work in progress, and the API is subject
to change. We currently have standalone IO and network workload generators, the
former exposing a "POSIX-ish" open/close/read/write interface, and the latter
exposing an "MPI-ish" send/recv/barrier/collective interface. In the future, we
will be unifying these generators.
As an additional utility, we provide a simple debug program,
src/workload/codes-workload-dump, that processes the workload and prints to
standard out.
=== IO
==== Darshan
We currently have initial support for extrapolating (lossy) Darshan logs
(https://www.mcs.anl.gov/research/projects/darshan/), a simple synthetic
IO kernel language, and in-development ScalaTrace
(http://moss.csc.ncsu.edu/~mueller/ScalaTrace/) and IO Recorder
(https://github.com/babakbehzad/Recorder) traces.
==== Synthetic IO language
==== Synthetic IO language (TODO)
==== ...
=== Network (TODO)
=== Network
== LP-IO
==== ...
LP-IO is a set of simple reverse-computation-aware routines for conditionally
outputting data on a per-LP basis. As the focus is on convenient, small-scale
data output, data written via LP-IO remains in memory until the end of the
simulation, or freed upon reverse computation. Large-scale,
reverse-computation-aware IO is a feature we're thinking about for future
usage.
=== Darshan
The API can be found at codes/lp-io.h and is fairly self-explanatory.
=== synthetic IO
== CODES configurator
== LP-IO
The configurator is a set of scripts intended to make the auto-generation of
multiple CODES configuration files easier, for the purposes of performing
parameter sweeps of simulations. The configuration file defining the parameters
in the parameter sweep is defined by a python source file with well-defined
field names, to maximize flexibility and enable some essential features for
flexible parameter sweeps (disabling certain combinations of parameters,
deriving parameters from other parameters in the sweep). The actual replacement
is driven by token replacement defined by the values in the configuraiton file.
== miscellaneous utilities
An exhaustive example can be found at scripts/example. The scripts themselves
are codes_configurator.py, codes_filter_configs.py, and
codes_config_get_vals.py, each with detailed usage info. These scripts have
heavily-overlapping functionality, so in the future these may be merged.
=== generic message header (see best practices)
== miscellaneous utilities
=== lp template (src/util/templates)
As writing ROSS/CODES models currently entail a not-insignificant amount of
boilerplate for defining LPs and hooking them into ROSS, we have a template
model for use at src/util/templates/lp_template.* .
=== generic message header (see best practices)
We recommend the use of codes/lp-msg.h to standardize LP event headers, making it
easier to identify messages.
= Utility models
== local storage model
The local storage model (LSM) is fairly simple in design but is sufficient for
many simulations with reasonable I/O access patterns. It is an
overhead/latency/bandwidth model that tracks file and offset accesses to
determine whether to apply seeking penalties to the performance of the access.
It uses a simple histogram-based approach to parameterization:
overhead/latency/bandwidth numbers are given relative to different access
sizes. To gather such parameters, well-known I/O benchmarks such as fio
(http://git.kernel.dk/?p=fio.git;a=summary) can be used.
The LP name used in configuration is "lsm" and the configuration is expected to
be in a similarly named standalone group, an example of which is shown below:
lsm
{
# in bytes
request_sizes = ( "4096","8192","16384","32768","65536","131072","262144","524288","1048576","2097152","4194304" );
# in MiB/s (2^20 bytes / s)
write_rates = ( "1511.7","1511.7","1511.7","1511.7","1511.7","1511.7","1511.7","1511.7","1511.7","1511.7","1511.7" );
read_rates = ( "1542.1","1542.1","1542.1","1542.1","1542.1","1542.1","1542.1","1542.1","1542.1","1542.1","1542.1" );
# in microseconds
write_seeks = ( "499.5","509.0","514.7","525.9","546.4","588.3","663.1","621.8","539.1","3179.5","6108.8" );
read_seeks = ( "3475.6","3470.0","3486.2","3531.2","3608.6","3741.0","3988.9","4530.2","5644.2","7922.0","11700.3" );
write_overheads = ( "29.67","29.67","29.67","29.67","29.67","29.67","29.67","29.67","29.67","29.67","29.67" );
read_overheads = ( "23.67","23.67","23.67","23.67","23.67","23.67","23.67","23.67","23.67","23.67","23.67" );
}
The API can be found at codes/local-storage-model.h and example usage can be
seen in tests/local-storage-model-test.c and tests/conf/lsm-test.conf.
== resource model
The resource model presents a simple integer counter representing some finite
resource (e.g., bytes of memory available). LPs request some number of units of
the resource, receiving a success/failure completion message via a callback
mechanism. Optional "blocking" can be used to defer the completion message
until the request is successfully completed.
The configuration LP name is "resource" and the parameters are given in a
similarly-named group. An example is shown below:
resource
{
available="8192";
}
The API for the underlying resource data structure can be found in
codes/resource.h. The user-facing API for communicating with the LP can be
found in codes/resource-lp.h.
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment