1.0.0 (July 12, 2018)

general:
=======
Adding support for dragonfly-plus network model. Multiple forms of routing
(progressive adaptive, minimal, non-minimal-spine and leaf) have been
implemented.

https://xgitlab.cels.anl.gov/codes/codes/wikis/dragonfly-plus

Adding support for express mesh network model, which can be configured as
hyperX.

Adding support for Multi-plane/rail in fat-tree via multiple single port NICs
per compute node or one multi-port NIC per node.

Adding a generic template for building new network models. In the simplest case,
only 2 functions and preamble changes should suffice to add a new network.
Updated Express Mesh network model to serve as an example. For details, see

Darshan workload generator has been updated to use Darshan version 3.x.

Network models updated to capture simulation statistics over virtual time using
ROSS/CODES instrumentation. For details, see:

https://xgitlab.cels.anl.gov/codes/codes/wikis/Using-ROSS-Instrumentation-with-CODES

Compatible with ROSS version that enables statistics collection of simulation
performance. For details see:

http://ross-org.github.io/instrumentation/instrumentation.html

Online workload replay functionality has been added that allows SWM workloads
to be simulated in situ on the network models. Work is in progress to integrate
the Conceptual domain-specific language for network communication.

Multiple traffic patterns were added in the background traffic generation
including stencil, all-to-all and random permutation.
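
As an illustration of the random permutation pattern, the sketch below assigns
every node one fixed destination by shuffling the node IDs. The function name
and the use of rand() are assumptions made for this example and are not taken
from the CODES sources; a model running in optimistic mode would presumably
draw from the simulator's reversible random number streams instead.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not part of CODES: fills dest[0..n-1] with a random
 * permutation of 0..n-1 (Fisher-Yates shuffle), so each node sends to exactly
 * one destination and each node is the destination of exactly one sender. */
void random_permutation(uint32_t *dest, uint32_t n, unsigned seed)
{
    srand(seed);
    for (uint32_t i = 0; i < n; i++)
        dest[i] = i;
    /* Walk from the back, swapping each slot with a random earlier slot. */
    for (uint32_t i = n; i > 1; i--) {
        uint32_t j = (uint32_t)rand() % i;   /* slight modulo bias, fine for a sketch */
        uint32_t tmp = dest[i - 1];
        dest[i - 1] = dest[j];
        dest[j] = tmp;
    }
}
```

Note that a plain permutation can map a node to itself; whether the CODES
generator excludes that case is not stated here.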

Performance tuning enabled for optimistic mode. For details, see:

https://xgitlab.cels.anl.gov/codes/codes/wikis/Optimistic-Performance-Tuning-Tips

0.6.0 (July 03, 2017)

general:
========

C++ models can now be built and integrated with CODES. The new dragonfly model
has been implemented in C++.

CODES can now replay collective operations by using CoRTex -- a library for
translating collectives to point to point operations. For details see:

https://xgitlab.cels.anl.gov/codes/codes/wikis/codes-cortex-workload

CODES wiki with details on model development, networking, MPI simulation and
more has been added at:

https://xgitlab.cels.anl.gov/codes/codes/wikis/home

Test suite has been extended -- tests for DUMPI trace replay have been added.

Unused variable warnings have been fixed.

Compatible with the most recent ROSS version that has visualization related
changes.

networks:
==========

dragonfly network model based on Cray XC topology has been added. The model can
use the network configurations of Theta and Edison systems. Custom network
configurations can be generated using C scripts. For details see:

https://xgitlab.cels.anl.gov/codes/codes/wikis/codes-dragonfly

fat tree network model with support for adaptive and static routing has been
added. The model can support both full and pruned fat tree configurations. For
details, refer to the wiki:

https://xgitlab.cels.anl.gov/codes/codes/wikis/codes-fattree

memory consumption has been reduced by doing lazy allocation of hash memory.

MPI trace replay:
==============

MPI rendezvous protocol can now be replayed in addition to the eager protocol.
The transition point for switching between the two protocols is configurable.
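
The protocol choice can be pictured as a simple size threshold: small messages
are sent eagerly (payload goes out immediately), larger ones use rendezvous
(a handshake precedes the payload). The sketch below uses hypothetical names,
not the CODES configuration keys, and assumes that a message exactly at the
limit is still sent eagerly.

```c
#include <stdint.h>

/* Hypothetical names for illustration only; not the CODES API. */
enum mpi_protocol { PROTO_EAGER, PROTO_RENDEZVOUS };

/* Pick the protocol for one message. eager_limit_bytes stands in for the
 * configurable transition point; the inclusive boundary is an assumption. */
enum mpi_protocol choose_protocol(uint64_t msg_bytes, uint64_t eager_limit_bytes)
{
    return (msg_bytes > eager_limit_bytes) ? PROTO_RENDEZVOUS : PROTO_EAGER;
}
```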

Background network communication using uniform random workload can now be
generated. The traffic generation gets automatically shut off when the main workload
finishes.

Collectives can now be translated into point to point using the CoRTex library.

Performance of MPI_Allreduce is reported when the debug_cols option is enabled.

Aggregate and average performance of different message sizes is reported when
message tracking is enabled (this option is for debugging and works in
sequential mode only).

Work in progress:
=============
Integration of the express mesh model and multi-rail/multi-plane fat tree
model.

ROSS-vis instrumentation -- recording model level statistics in binary format
and translating them into text.

0.5.2 (July 13, 2016)

Summer of CODES was another huge success! This release was created during the
  workshop.

general:
==========

the ROSS "commit" function was added to the CODES models from the latest
  version of ROSS (d3bdc07)

the pkg-config program is properly checked for, resulting in an error if it
  is not found.

the "configfile" API has been promoted to a public API, allowing more flexible
  management of configuration entities. It is currently undocumented.


networks:
==========

remove overly-restrictive assert from dragonfly

workloads:
==========

enabled max wait time stat in MPI replay

print an error message if pending waits are never processed in MPI replay

utilities:
==========

the mapping context API gained a new mapping method, which uses the ratio of
  mapped-from to mapped-to entities to map contiguous mapped-from IDs.
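
One way to read the ratio-based method: with 8 mapped-from entities and 2
mapped-to entities the ratio is 4, so IDs 0-3 map to entity 0 and IDs 4-7 map
to entity 1. The sketch below is an assumption about the arithmetic, with a
hypothetical function name; in particular, clamping the leftover IDs when the
counts do not divide evenly is a choice made for this example.

```c
#include <stdint.h>

/* Hypothetical helper, not the codes-mapping-context API.
 * Maps from_id in [0, num_from) to an index in [0, num_to).
 * Precondition: num_to > 0. */
uint64_t ratio_map(uint64_t from_id, uint64_t num_from, uint64_t num_to)
{
    uint64_t ratio = num_from / num_to;   /* mapped-from entities per target */
    if (ratio == 0)
        ratio = 1;                        /* fewer sources than targets */
    uint64_t to = from_id / ratio;        /* contiguous IDs share a target */
    return (to < num_to) ? to : num_to - 1;   /* clamp any remainder (assumption) */
}
```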

0.5.1 (June 09, 2016)

network:
==========

corrected link latency calculation in dragonfly model

fixed printf argument mismatch in dragonfly model

refactors to the torus bandwidth calculation to mirror that of the other
  networks (no functional change)

more robust type conversions in dragonfly (int sizes -> uint64_t)

removed the redundant and obsolete MPI replay simulator
  (modelnet-mpi-wrklds.c). The proper version to use is modelnet-mpi-replay.c

----------

0.5.0 (May 24, 2016)

general:
==========

codes-base and codes-net have been combined into a single project (now at
https://xgitlab.cels.anl.gov/codes/codes).

updated to ROSS revision d9cef53.

fixed a large number of warnings across the codebase.

networks:
==========

addition of the SlimFly network topology, corresponding to the Wolfe et al.
  paper "Modeling a Million-node Slim Fly Network using Parallel Discrete-event
  Simulation", at SIGSIM-PADS'16. See README.slimfly.txt
  (src/networks/model-net/doc).

modelnet now supports sampling at regular intervals. Dragonfly LPs can
  currently make use of this - others can be added based on demand. See
  (src/network-workloads/README_synthetic.txt).

credit-based flow control in the dragonfly and torus network models has been
  updated. The dragonfly model's adaptive routing algorithms have also been
  updated. For details, see the paper "Enabling Parallel Simulation of
  Large-Scale HPC Network Systems", M. Mubarak et al., in IEEE Trans. on
  Parallel and Distributed Systems (TPDS).

allow 0-byte messages in model-net.

enable "local" model-net messages (for LPs sharing the same model-net endpoint -
  approximates a zero-copy ownership pass of the payload)

the model_net_event family of functions now return a token value which must
  be passed to model_net_event_rc2.
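
The note does not say what the token contains. A common reason for this kind
of design in optimistic simulation is that the reverse handler needs to know
exactly what the forward call did in order to undo it. The sketch below shows
that pattern with mock types and functions; every name in it is hypothetical
and none of it is the model-net API.

```c
#include <stdint.h>

/* Mock state and token types; hypothetical stand-ins for illustration. */
struct net_state  { uint64_t bytes_queued; uint32_t rng_draws; };
struct send_token { uint64_t bytes;        uint32_t rng_draws; };

/* Forward call: performs the send and records what it changed. */
struct send_token mock_send(struct net_state *s, uint64_t bytes, int needs_rng)
{
    struct send_token t = { bytes, needs_rng ? 1u : 0u };
    s->bytes_queued += t.bytes;
    s->rng_draws    += t.rng_draws;
    return t;   /* the caller keeps this, e.g. in its event message */
}

/* Reverse call: undoes exactly what the token says was done. */
void mock_send_rc(struct net_state *s, const struct send_token *t)
{
    s->bytes_queued -= t->bytes;
    s->rng_draws    -= t->rng_draws;
}
```

In this pattern the caller stores the token wherever it keeps its other
per-event rollback data, then hands it back during reverse computation.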

workloads:
==========

concurrent workload support added to workload generators and the MPI
  simulation layer. See codes-jobmap.h and the modified codes-workload.h.
  Thanks to Xu Yang for the partial contributions. Not all workload generators
  support concurrent workloads at this time.

scripts for generating job allocations specific to the torus and dragonfly
  topologies. See scripts/allocation_gen. Thanks to Xu Yang for the
  contributions.

multiple fixes to the MPI simulation layer and DUMPI workload generator.

concurrent workload support and more flexible rank mappings in the MPI
  simulation layer. Thanks to Xu Yang for the initial code.

removed scalatrace workload generator, which never made it to a usable state.

a new checkpoint IO workload generator has been added, based on the Daly paper
  "A higher order estimate of the optimum checkpoint interval for restart
  dumps" in Future Generation Computer Systems 2004. See README.codes-workload
  (src/workload).

utilities:
==========

fixes to rc-stack - memory no longer leaks in sequential mode, and optimistic
  debug mode is now supported.

added a more performance-sensitive function, codes_mapping_get_lp_info2, to
  codes-mapping (passes const pointers around instead of copying strings)

added a "mapping context" API for better controlling implicit LP-LP mappings,
  including modelnet, local storage model, and resource. See
  codes/codes-mapping-context.h and added functions in the mentioned LPs for
  details.

formalized a callback mechanism for CODES, replacing previous ad-hoc
  methods of passing control from LPs to arbitrary other LPs. The request and
  local storage model LP APIs have been changed to use this mechanism, and
  model-net has additional APIs to use user-provided mapping contexts. See
  codes/codes-callback.h for the API and tests/resource-test.c for advanced
  usage.

deprecations:
==========

model_net_event_rc (use model_net_event_rc2, which will eventually be renamed
  to model_net_event_rc)

codes_event_new (define your own bounds-checking macro if need be)

----------

0.4.1 (September 30, 2015)

general:
==========

fix compatibility with recent ROSS releases

----------

0.4.0 (May 6, 2015)

codes-base

general:
==========

significant source reorganization / refactoring

refactor some private headers out of the public eye

dead code removal

documentation:
==========

improved example_heterogeneous example program

added configuration to example_heterogeneous showing two torus networks

reorganized files to prevent name collisions on OSX. Top-level docs other than
  copyright now in doc directory

additions to best practice document

configurator:
==========

more stable file format for configurator output

ignore unrelated parameters passed into filter_configs

handle empty cfields in configurator

workloads:
==========

combined network and IO workload APIs into a single one

adding dumpi workload support in codes-workload-dump utility

workload dump utility option cleanup

renamed "bgp" workload generator to "iolang", significant cleanups

put network workload ops in workload dump util

removing one of the dumpi libraries from the build. It was generating some unwanted dumpi files.

network workload API more fleshed out

utilities:
==========

configuration bug fixes for larger LP type counts

resource LP annotation mapping hooks

local storage model API switch to use annotations

better configuration error handling

hedge against precision loss in codes_local_latency (see codes.h)

use a different RNG than default for codes_local_latency
- prevents addition/removal of codes_local_latency calls from poisoning RNG
  stream of calling model

added simple GVT-aware stack with garbage collection (see rc-stack.h)

codes-net

general:
==========

cleanup of much of the code base

more informative error for failure to find modelnet lps

removed redundant include directory on install (was 'install/codes/codes/*.h')

documentation:
==========

reorganized files to prevent name collisions on OSX. Top-level docs other than
  copyright now in doc directory

updated code documentation

fix linker error in certain cases with codes-base

tweaked config error handling


networks:
==========
fix to loggp latency calculation when using "receive queue"

made torus lps agnostic to groups and aware of annotations

miscellaneous fixes to dragonfly model

updates to simplep2p: support for having different latency/bw at sender &
  receiver end. See src/models/networks/model-net/doc/README.simplep2p.txt

minor fixes to usage of quickhash in replay tool

fixed RNG reverse computation bug in loggp

fixed swapped arguments in round-robin scheduler causing short circuit

workloads:
==========

minor changes to dumpi trace config files

resolving minor bug with reverse computation in dumpi traces

Updating network trace code to use the combined workload API

Adding synthetic traffic patterns (currently with dragonfly model)

Adding network workload test program for debugging

Updating MPI wait/wait_all code in replay tool

----------

0.3.0 (November 7, 2014)

codes-base

Initial "official" release. Against previous repository revisions, this release
includes more complete documentation.

codes-net

Initial "official" release. Against previous repository revisions, this release
includes more complete documentation and a rename of the "simplewan" model to
the "simplep2p" (simple point-to-point) model to more accurately represent
what it's modeling.