Jonathan Jenkins committed
This is an outline for the getting started document in a roughly asciidoc
format.

= CODES/ROSS resources

CODES and ROSS share a mailing list. It is at:
https://lists.mcs.anl.gov/mailman/listinfo/codes-ross-users

== CODES

* main site: http://www.mcs.anl.gov/projects/codes/ 
* repositories:
  * "base" (this repository): git.mcs.anl.gov:radix/codes-base
  * codes-net (networking support): git.mcs.anl.gov:radix/codes-net
* bug tracking: https://trac.mcs.anl.gov/projects/CODES

== ROSS

* main site, repository, etc.: https://github.com/carothersc/ROSS

= Components of CODES 

== configuration

LP configuration, LP parameterization, and miscellaneous simulation parameters
are specified through the CODES configuration system, which uses a structured
configuration file. The format groups key-value pairs into categories, with
optional subgroups within a category. The LPGROUPS category defines the LP
configuration. The PARAMS category is currently used for networking and
ROSS-specific parameters. User-defined categories can also be used.

The configuration system additionally allows LP specialization via the usage of
"annotations". This allows two otherwise identical LPs to have different
parameterizations. Annotations have a simple "@" syntax appended to the LP
fields, and are optional.
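
As a rough illustration of the "@" syntax, the sketch below splits an LP field
into its name and optional annotation. The function name and the sample field
names are hypothetical and this is not the CODES parser; it only shows the
shape of the convention described above.

```python
def split_lp_field(field):
    """Split 'name@annotation' into (name, annotation).

    Hypothetical helper for illustration only; annotation is None when the
    field carries no '@' suffix.
    """
    name, sep, annotation = field.partition("@")
    return name, (annotation if sep else None)


# Two otherwise identical LPs, distinguished only by their annotation.
print(split_lp_field("server@fast"))
print(split_lp_field("server"))
```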

The API is located at codes/configuration.h, which provides various types of
access into the simulation configuration. Detailed example configuration files
can be found at doc/example/example.conf and
doc/example_heterogeneous/example.conf.

== LP mapping

The codes-mapping API maps user LPs to global LP IDs, providing numerous
options for modulating the namespace under which the mapping is conducted.
Mapping is performed on a per-group or per-LP-type basis, with further
filtering options, including filtering on an LP's annotation. Finally, the
mapping API provides LP counts using the same filtering options.
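
To make the idea concrete, here is a minimal sketch of one way such a mapping
could be computed. It assumes global IDs are assigned sequentially, group by
group, repetition by repetition, and LP type by LP type; that ordering, the
LAYOUT structure, and the function names are our assumptions for illustration
and are not taken from codes/codes_mapping.h. Annotation filtering is omitted
for brevity.

```python
# Hypothetical layout: group name -> (repetitions, {lp type: count per repetition}).
LAYOUT = {
    "SERVERS": (2, {"server": 1, "lsm": 2}),
    "CLIENTS": (3, {"client": 1}),
}


def global_id(layout, group, lp_type, rep, offset=0):
    """Return the global LP ID under the assumed sequential ordering."""
    base = 0
    for group_name, (reps, types) in layout.items():
        per_rep = sum(types.values())
        if group_name != group:
            base += reps * per_rep  # skip every LP in an earlier group
            continue
        if not 0 <= rep < reps:
            raise IndexError("repetition out of range")
        base += rep * per_rep  # skip earlier repetitions of this group
        for type_name, count in types.items():
            if type_name == lp_type:
                if not 0 <= offset < count:
                    raise IndexError("offset out of range")
                return base + offset
            base += count  # skip earlier LP types in this repetition
        raise KeyError(lp_type)
    raise KeyError(group)


def lp_count(layout, lp_type, group=None):
    """Count LPs of one type, optionally restricted to a single group."""
    return sum(reps * types.get(lp_type, 0)
               for name, (reps, types) in layout.items()
               if group is None or name == group)
```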

The API can be found at codes/codes_mapping.h. doc/example/example.c shows a
simple example of the mapping functionality, while the test program
tests/mapping_test.c with configuration file tests/conf/mapping_test.conf
exhaustively demonstrates the mapping API.

== workload generator(s)

codes-workload is an in-development abstraction layer for feeding I/O / network
workloads into a simulation. It supports multiple back-ends for generating I/O
and network events; data could come from a trace file, from Darshan, or from a
synthetic description.

The workload generator is currently a work in progress, and the API is subject
to change. We currently have standalone IO and network workload generators, the
former exposing a "POSIX-ish" open/close/read/write interface, and the latter
exposing an "MPI-ish" send/recv/barrier/collective interface. In the future, we
will be unifying these generators.
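
To give a feel for what a "POSIX-ish" operation stream looks like, the sketch
below generates one from a trivial synthetic description. The Op record, its
fields, and synthetic_ops are hypothetical names invented for this example;
they are not the codes-workload API, which is still subject to change.

```python
from collections import namedtuple

# Hypothetical operation record; not a codes-workload type.
Op = namedtuple("Op", "kind file_id offset size")


def synthetic_ops(file_id, total_bytes, chunk):
    """Yield open, a run of sequential writes, then close for one file."""
    yield Op("open", file_id, 0, 0)
    offset = 0
    while offset < total_bytes:
        size = min(chunk, total_bytes - offset)  # last write may be short
        yield Op("write", file_id, offset, size)
        offset += size
    yield Op("close", file_id, 0, 0)


for op in synthetic_ops(file_id=7, total_bytes=10, chunk=4):
    print(op)
```

A trace-driven or Darshan-driven back-end would produce the same kind of
stream from recorded data instead of from a formula.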

As an additional utility, we provide a simple debug program,
src/workload/codes-workload-dump, that processes the workload and prints to
standard out.

=== IO

We currently have initial support for extrapolating (lossy) Darshan logs
(https://www.mcs.anl.gov/research/projects/darshan/), a simple synthetic
IO kernel language, and in-development ScalaTrace
(http://moss.csc.ncsu.edu/~mueller/ScalaTrace/) and IO Recorder
(https://github.com/babakbehzad/Recorder) traces.

==== Synthetic IO language (TODO)

=== Network (TODO)

== LP-IO

LP-IO is a set of simple reverse-computation-aware routines for conditionally
outputting data on a per-LP basis. As the focus is on convenient, small-scale
data output, data written via LP-IO remains in memory until the end of the
simulation, or is freed upon reverse computation. Large-scale,
reverse-computation-aware IO is a feature we're thinking about for future
usage.
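
The behaviour described above can be sketched as an in-memory buffer whose
writes can be undone. The class and method names below are hypothetical and do
not mirror codes/lp-io.h; the sketch only illustrates holding data until the
end of the run and freeing it on reverse computation.

```python
import io


class LpBuffer:
    """Hypothetical per-LP output buffer that supports rollback."""

    def __init__(self):
        self._records = {}  # lp id -> list of buffered strings

    def write(self, lp_id, text):
        """Forward event handler: buffer the data in memory."""
        self._records.setdefault(lp_id, []).append(text)

    def write_rev(self, lp_id):
        """Reverse event handler: drop the most recent write for this LP."""
        self._records[lp_id].pop()

    def flush(self, out):
        """End of simulation: emit whatever survived rollback."""
        for lp_id in sorted(self._records):
            for text in self._records[lp_id]:
                out.write(f"{lp_id} {text}\n")

    def dump(self):
        """Convenience wrapper returning the flushed output as a string."""
        out = io.StringIO()
        self.flush(out)
        return out.getvalue()
```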

The API can be found at codes/lp-io.h and is fairly self-explanatory.

== CODES configurator

The configurator is a set of scripts that make it easier to auto-generate
multiple CODES configuration files for parameter sweeps of simulations. The
parameters in the sweep are defined by a python source file with well-defined
field names, which maximizes flexibility and enables some essential features
for parameter sweeps (disabling certain combinations of parameters, deriving
parameters from other parameters in the sweep). The configuration files
themselves are generated by token replacement, driven by the values in that
python file.
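
The general technique can be sketched as follows. This is not the
codes_configurator.py implementation: the sweep function, its exclude and
derive hooks, and the token names are hypothetical, chosen only to show a
Cartesian sweep with excluded combinations, derived parameters, and token
replacement in a template.

```python
import itertools


def sweep(template, params, exclude=None, derive=None):
    """Yield (values, text) for each allowed parameter combination.

    params  : token -> list of candidate values
    exclude : optional predicate that rejects a combination
    derive  : optional function returning extra tokens computed from the rest
    """
    names = sorted(params)
    for values in itertools.product(*(params[name] for name in names)):
        cfg = dict(zip(names, values))
        if exclude is not None and exclude(cfg):
            continue
        if derive is not None:
            cfg.update(derive(cfg))
        text = template
        # Replace longer tokens first so one token cannot clobber another
        # token that contains it as a substring.
        for token in sorted(cfg, key=len, reverse=True):
            text = text.replace(token, str(cfg[token]))
        yield cfg, text


TEMPLATE = 'repetitions="NUM_REPS"; packet_size="PKT_SIZE"; total="TOTAL_BYTES";'

for cfg, text in sweep(
        TEMPLATE,
        {"NUM_REPS": [1, 2], "PKT_SIZE": [512, 1024]},
        exclude=lambda c: c["NUM_REPS"] == 2 and c["PKT_SIZE"] == 1024,
        derive=lambda c: {"TOTAL_BYTES": c["NUM_REPS"] * c["PKT_SIZE"]}):
    print(text)
```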

An exhaustive example can be found at scripts/example. The scripts themselves
are codes_configurator.py, codes_filter_configs.py, and
codes_config_get_vals.py, each with detailed usage info. These scripts have
heavily-overlapping functionality, so in the future these may be merged.

== miscellaneous utilities

=== lp template (src/util/templates)

As writing ROSS/CODES models currently entails a not-insignificant amount of
boilerplate for defining LPs and hooking them into ROSS, we have a template
model for use at src/util/templates/lp_template.* .

=== generic message header (see best practices)

We recommend the use of codes/lp-msg.h to standardize LP event headers, making it
easier to identify messages.

= Utility models

== local storage model

The local storage model (LSM) is fairly simple in design but is sufficient for
many simulations with reasonable I/O access patterns. It is an
overhead/latency/bandwidth model that tracks file and offset accesses to
determine whether to apply seeking penalties to the performance of the access.
It uses a simple histogram-based approach to parameterization:
overhead/latency/bandwidth numbers are given relative to different access
sizes. To gather such parameters, well-known I/O benchmarks such as fio
(http://git.kernel.dk/?p=fio.git;a=summary) can be used.

The LP name used in configuration is "lsm" and the configuration is expected to
be in a similarly named standalone group, an example of which is shown below:

lsm
{
    # in bytes
    request_sizes   = ( "4096","8192","16384","32768","65536","131072","262144","524288","1048576","2097152","4194304" );
    # in MiB/s (2^20 bytes / s)
    write_rates     = ( "1511.7","1511.7","1511.7","1511.7","1511.7","1511.7","1511.7","1511.7","1511.7","1511.7","1511.7" );
    read_rates      = ( "1542.1","1542.1","1542.1","1542.1","1542.1","1542.1","1542.1","1542.1","1542.1","1542.1","1542.1" );
    # in microseconds
    write_seeks     = ( "499.5","509.0","514.7","525.9","546.4","588.3","663.1","621.8","539.1","3179.5","6108.8" );
    read_seeks      = ( "3475.6","3470.0","3486.2","3531.2","3608.6","3741.0","3988.9","4530.2","5644.2","7922.0","11700.3" );
    write_overheads = ( "29.67","29.67","29.67","29.67","29.67","29.67","29.67","29.67","29.67","29.67","29.67" );
    read_overheads  = ( "23.67","23.67","23.67","23.67","23.67","23.67","23.67","23.67","23.67","23.67","23.67" );
}

The API can be found at codes/local-storage-model.h and example usage can be
seen in tests/local-storage-model-test.c and tests/conf/lsm-test.conf. 
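
As a rough illustration of the overhead/latency/bandwidth idea, the sketch
below computes a write time from a shortened copy of the table above. Several
details are assumptions on our part and are not taken from the LSM source: the
bin selection rule (first bin at least as large as the request), treating
overheads as microseconds, charging a seek on the first access to a file, and
all function and class names.

```python
import bisect

# First three bins of the example configuration above.
REQUEST_SIZES = [4096, 8192, 16384]         # bytes
WRITE_RATES = [1511.7, 1511.7, 1511.7]      # MiB/s
WRITE_SEEKS = [499.5, 509.0, 514.7]         # microseconds
WRITE_OVERHEADS = [29.67, 29.67, 29.67]     # microseconds (assumed unit)


def bin_index(size):
    """Pick the first bin >= size, clamped to the last bin (assumed rule)."""
    return min(bisect.bisect_left(REQUEST_SIZES, size), len(REQUEST_SIZES) - 1)


def write_time_us(size, sequential):
    """Overhead + optional seek penalty + transfer time, in microseconds."""
    i = bin_index(size)
    transfer = size / (WRITE_RATES[i] * 2**20) * 1e6
    seek = 0.0 if sequential else WRITE_SEEKS[i]
    return WRITE_OVERHEADS[i] + seek + transfer


class Disk:
    """Tracks the next expected offset per file to detect sequential access."""

    def __init__(self):
        self._next_offset = {}

    def write(self, file_id, offset, size):
        sequential = self._next_offset.get(file_id) == offset
        self._next_offset[file_id] = offset + size
        return write_time_us(size, sequential)
```

With this sketch, two back-to-back 4096-byte writes to the same file differ
only by the seek penalty: the first pays it and the second does not.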

== resource model

The resource model presents a simple integer counter representing some finite
resource (e.g., bytes of memory available). LPs request some number of units of
the resource, receiving a success/failure completion message via a callback
mechanism. Optional "blocking" can be used to defer the completion message
until the request is successfully completed.

The configuration LP name is "resource" and the parameters are given in a
similarly-named group. An example is shown below:

resource
{
    available="8192";
}

The API for the underlying resource data structure can be found in
codes/resource.h. The user-facing API for communicating with the LP can be
found in codes/resource-lp.h.
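
The request/callback behaviour described in this section can be sketched as a
counter with an optional wait queue. The class below is hypothetical and does
not reflect codes/resource.h or codes/resource-lp.h; in particular, serving
blocked requests in FIFO order when units are freed is our assumption.

```python
from collections import deque


class Resource:
    """Hypothetical finite-resource counter with optional blocking requests."""

    def __init__(self, available):
        self.available = available
        self._blocked = deque()  # (units, callback) pairs awaiting capacity

    def request(self, units, callback, block=False):
        """Grant immediately if possible; otherwise fail or wait."""
        if units <= self.available:
            self.available -= units
            callback(True)
        elif block:
            self._blocked.append((units, callback))  # completion deferred
        else:
            callback(False)

    def free(self, units):
        """Return units, then complete blocked requests in FIFO order."""
        self.available += units
        while self._blocked and self._blocked[0][0] <= self.available:
            pending, callback = self._blocked.popleft()
            self.available -= pending
            callback(True)


log = []
res = Resource(8192)
res.request(8000, log.append)              # succeeds
res.request(500, log.append)               # fails: only 192 units left
res.request(500, log.append, block=True)   # deferred until units are freed
res.free(8000)                             # completes the deferred request
print(log, res.available)
```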