0.6.0 (July 03, 2017)

general:
========

C++ models can now be built and integrated with CODES. The new dragonfly model
has been implemented in C++.

Support has been added for building CODES with CoRTex, a library for translating
collectives into point-to-point operations. For details see:

https://xgitlab.cels.anl.gov/codes/codes/wikis/codes-cortex-workload

CODES wiki with details on model development, networking, MPI simulation and
more has been added at:

https://xgitlab.cels.anl.gov/codes/codes/wikis/home

Test suite has been extended -- tests for DUMPI trace replay have been added.

Unused variable warnings have been fixed.

Compatible with the most recent ROSS version that has visualization related
changes.

networks:
==========

dragonfly network model based on Cray XC topology has been added. The model can
use the network configurations of Theta and Edison systems. Custom network
configurations can be generated using C scripts. For details see:

https://xgitlab.cels.anl.gov/codes/codes/wikis/codes-dragonfly

fat tree network model with support for adaptive and static routing has been
added. The model can support both full and pruned fat tree configurations. For
details, refer to the wiki:

https://xgitlab.cels.anl.gov/codes/codes/wikis/codes-fattree

memory consumption has been reduced by doing lazy allocation of hash memory.

MPI trace replay:
==============

MPI rendezvous protocol can now be replayed in addition to the eager protocol.
The transition point for switching between the two protocols is configurable.
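
As a rough illustration of the protocol switch described above, the sketch
below picks a protocol from the message size. The struct, the field name
eager_limit_bytes, and the inclusive boundary are assumptions made for this
example; the notes do not give the actual option name or boundary rule.

```c
#include <assert.h>
#include <stddef.h>

enum mpi_protocol { PROTO_EAGER, PROTO_RENDEZVOUS };

/* Hypothetical configuration holder; not the CODES configuration API. */
struct replay_config {
    size_t eager_limit_bytes;   /* assumed name for the transition point */
};

/* Messages up to the limit go eager; larger ones use rendezvous.
 * Treating the limit itself as eager is an assumption of this sketch. */
static enum mpi_protocol choose_protocol(const struct replay_config *cfg,
                                         size_t msg_bytes)
{
    assert(cfg != NULL);
    return (msg_bytes <= cfg->eager_limit_bytes) ? PROTO_EAGER
                                                 : PROTO_RENDEZVOUS;
}
```

With a 16 KiB limit, a 1 KiB message would be replayed eagerly and a 1 MiB
message through rendezvous.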

Background network communication using uniform random workload can now be
generated. The traffic generation gets automatically shut off when the main workload
finishes.

Collectives can now be translated into point-to-point operations using the CoRTex library.

Performance of MPI_Allreduce is reported when the debug_cols option is enabled.

Aggregate and average performance of different message sizes is reported when
message tracking is enabled (this option is for debugging and works in
sequential mode only).

Work in progress:
=============
Integration of the express mesh model and multi-rail/multi-plane fat tree
model.

ROSS-vis instrumentation -- recording model level statistics in binary format
and translating them into text.

Running ensemble simulations with Swift.

0.5.2 (July 13, 2016)

Summer of CODES was another huge success! This release was created during the
  workshop.

general:
==========

the ROSS "commit" function was added to the CODES models from the latest
  version of ROSS (d3bdc07)

the pkg-config program is properly checked for, resulting in an error if it
  is not found.

the "configfile" API has been promoted to a public API, allowing more flexible
  management of configuration entities. It is currently undocumented.


networks:
==========

remove overly-restrictive assert from dragonfly

workloads:
==========

enabled max wait time stat in MPI replay

print an error message if pending waits are never processed in MPI replay

utilities:
==========

the mapping context API gained a new mapping method, which uses the ratio of
  mapped-from to mapped-to entities to map contiguous mapped-from IDs.
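
One way to read the ratio-based mapping above is sketched below. The function
name and the handling of counts that do not divide evenly are assumptions of
this example, not the behavior of the CODES mapping context API.

```c
#include <assert.h>

/* Hypothetical helper: assign contiguous blocks of source IDs to
 * destinations, using the from/to ratio as the block size. */
static int ratio_map(int from_id, int num_from, int num_to)
{
    assert(num_from > 0 && num_to > 0);
    assert(from_id >= 0 && from_id < num_from);

    int ratio = num_from / num_to;      /* sources per destination */
    if (ratio < 1)
        ratio = 1;                      /* fewer sources than destinations */

    int to_id = from_id / ratio;
    if (to_id >= num_to)
        to_id = num_to - 1;             /* assumed: leftovers go to the last one */
    return to_id;
}
```

With 8 sources and 2 destinations, sources 0-3 map to destination 0 and
sources 4-7 map to destination 1.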

0.5.1 (June 09, 2016)

networks:
==========

corrected link latency calculation in dragonfly model

fixed printf argument mismatch in dragonfly model

refactors to the torus bandwidth calculation to mirror that of the other
  networks (no functional change)

more robust type conversions in dragonfly (int sizes -> uint64_t)

removed the redundant and obsolete MPI replay simulator
  (modelnet-mpi-wrklds.c). The proper version to use is modelnet-mpi-replay.c

----------

0.5.0 (May 24, 2016)

general:
==========

codes-base and codes-net have been combined into a single project (now at
https://xgitlab.cels.anl.gov/codes/codes).

updated to ROSS revision d9cef53.

fixed a large number of warnings across the codebase.

networks:
==========

addition of the SlimFly network topology, corresponding to the Wolfe et al.
  paper "Modeling a Million-node Slim Fly Network using Parallel Discrete-event
  Simulation", at SIGSIM-PADS'16. See README.slimfly.txt
  (src/networks/model-net/doc).

modelnet now supports sampling at regular intervals. Dragonfly LPs can
  currently make use of this - others can be added based on demand. See
  (src/network-workloads/README_synthetic.txt).

credit-based flow control in the dragonfly and torus network models has been
  updated. The dragonfly model's adaptive routing algorithms have also been
  updated. For details, see the paper "Enabling Parallel Simulation of
  Large-Scale HPC Network Systems", M. Mubarak et al., IEEE Trans. on Parallel
  and Distributed Systems (TPDS).

allow 0-byte messages in model-net.

enable "local" model-net messages (for LPs sharing the same model-net endpoint -
  approximates a zero-copy ownership pass of the payload)

the model_net_event family of functions now return a token value which must
  be passed to model_net_event_rc2.
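
The token requirement above follows a common reverse-computation pattern: the
forward call returns a record of what it changed, and the reverse call uses
that record to undo it. The sketch below shows the pattern with hypothetical
stand-in types and functions (endpoint, send_token, send_event,
send_event_rc); it does not reproduce the model_net_event signatures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; these are not CODES types. */
struct endpoint   { int pending_msgs; };   /* state the forward call mutates */
struct send_token { int msgs_queued; };    /* record of what was changed */

/* Forward path: mutate state and hand back a token describing the change. */
static struct send_token send_event(struct endpoint *ep)
{
    struct send_token tok = { 1 };
    ep->pending_msgs += tok.msgs_queued;
    return tok;
}

/* Reverse path: undo exactly what the token says the forward call did. */
static void send_event_rc(struct endpoint *ep, const struct send_token *tok)
{
    assert(ep != NULL && tok != NULL);
    assert(ep->pending_msgs >= tok->msgs_queued);
    ep->pending_msgs -= tok->msgs_queued;
}
```

In a real model the token would typically be stored in the event message
during the forward handler so the reverse handler can retrieve it.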

workloads:
==========

concurrent workload support added to workload generators and the MPI
  simulation layer. See codes-jobmap.h and the modified codes-workload.h.
  Thanks to Xu Yang for the partial contributions. Not all workload generators
  support concurrent workloads at this time.

scripts for generating job allocations specific to the torus and dragonfly
  topologies. See scripts/allocation_gen. Thanks to Xu Yang for the
  contributions.

multiple fixes to the MPI simulation layer and DUMPI workload generator.

concurrent workload support and more flexible rank mappings in the MPI
  simulation layer. Thanks to Xu Yang for the initial code.

removed scalatrace workload generator, which never made it to a usable state.

a new checkpoint IO workload generator has been added, based on the Daly paper
  "A higher order estimate of the optimum checkpoint interval for restart
  dumps" at Future Generation Computer Systems 2004. See README.codes-workload
  (src/workload).
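
For reference, the higher-order estimate from the cited paper is usually
stated as below, where \delta is the time to write one checkpoint and M is the
mean time between interrupts. Whether the workload generator uses exactly this
form is an assumption; README.codes-workload is the authority on that.

```latex
\tau_{\mathrm{opt}} =
\begin{cases}
  \sqrt{2\delta M}\left[\,1
      + \dfrac{1}{3}\left(\dfrac{\delta}{2M}\right)^{1/2}
      + \dfrac{1}{9}\left(\dfrac{\delta}{2M}\right)\right] - \delta,
      & \delta < 2M, \\[2ex]
  M, & \delta \ge 2M.
\end{cases}
```

For example, with \delta = 5 minutes and M = 1440 minutes, the leading factor
is \sqrt{14400} = 120 and the estimate comes to roughly 116.7 minutes.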

utilities:
==========

fixes to rc-stack - memory no longer leaks in sequential mode, and optimistic
  debug mode is now supported.

added a more performance-sensitive function, codes_mapping_get_lp_info2, to
  codes-mapping (passes const pointers around instead of copying strings)

added a "mapping context" API for better controlling implicit LP-LP mappings,
  including modelnet, local storage model, and resource. See
  codes/codes-mapping-context.h and added functions in the mentioned LPs for
  details.

formalized a callback mechanism for CODES, replacing previous ad-hoc
  methods of passing control from LPs to arbitrary other LPs. The request and
  local storage model LP APIs have been changed to use this mechanism, and
  model-net has additional APIs to use user-provided mapping contexts. See
  codes/codes-callback.h for the API and tests/resource-test.c for advanced
  usage.

deprecations:
==========

model_net_event_rc (use model_net_event_rc2, which will eventually be renamed
  to model_net_event_rc)

codes_event_new (define your own bounds-checking macro if need be)

----------

0.4.1 (September 30, 2015)

general:
==========

fix compatibility with recent ROSS releases

----------

0.4.0 (May 6, 2015)

codes-base

general:
==========

significant source reorganization / refactoring

refactor some private headers out of the public eye

dead code removal

documentation:
==========

improved example_heterogeneous example program

added configuration to example_heterogeneous showing two torus networks

reorganized files to prevent name collisions on OSX. Top-level docs other than
  copyright now in doc directory

additions to best practice document

configurator:
==========

more stable file format for configurator output

ignore unrelated parameters passed into filter_configs

handle empty cfields in configurator

workloads:
==========

combined network and IO workload APIs into a single one

adding dumpi workload support in codes-workload-dump utility

workload dump utility option cleanup

renamed "bgp" workload generator to "iolang", significant cleanups

put network workload ops in workload dump util

removing one of the dumpi libraries from the build. It was generating some
  unwanted dumpi files.

network workload API more fleshed out

utilities:
==========

configuration bug fixes for larger LP type counts

resource LP annotation mapping hooks

local storage model API switch to use annotations

better configuration error handling

hedge against precision loss in codes_local_latency (see codes.h)

use a different RNG than default for codes_local_latency
- prevents addition/removal of codes_local_latency calls from poisoning RNG
  stream of calling model

added simple GVT-aware stack with garbage collection (see rc-stack.h)

codes-net

general:
==========

cleanup of much of the code base

more informative error for failure to find modelnet lps

removed redundant include directory on install (was 'install/codes/codes/*.h')

documentation:
==========

reorganized files to prevent name collisions on OSX. Top-level docs other than
  copyright now in doc directory

updated code documentation

fix linker error in certain cases with codes-base

tweaked config error handling


networks:
==========

fix to loggp latency calculation when using "receive queue"

made torus lps agnostic to groups and aware of annotations

miscellaneous fixes to dragonfly model

updates to simplep2p: support for having different latency/bw at sender &
  receiver end. See src/models/networks/model-net/doc/README.simplep2p.txt

minor fixes to usage of quickhash in replay tool

fixed RNG reverse computation bug in loggp

fixed swapped arguments in round-robin scheduler causing short circuit

workloads:
==========

minor changes to dumpi trace config files

resolving minor bug with reverse computation in dumpi traces

Updating network trace code to use the combined workload API

Adding synthetic traffic patterns (currently with dragonfly model)

Adding network workload test program for debugging

Updating MPI wait/wait_all code in replay tool

----------

0.3.0 (November 7, 2014)

codes-base

Initial "official" release. Against previous repository revisions, this release
includes more complete documentation.

codes-net

Initial "official" release. Against previous repository revisions, this release
includes more complete documentation and a rename of the "simplewan" model to
the "simplep2p" (simple point-to-point) model to more accurately represent
what it's modeling.