0.5.2 (July 13, 2016)

Summer of CODES was another huge success! This release was created during the
  workshop.

general:
==========

support for the ROSS "commit" function, available as of the latest version of
  ROSS (d3bdc07), was added to the CODES models

the pkg-config program is now checked for properly, and an error is reported if
  it is not found.

the "configfile" API has been promoted to a public API, allowing more flexible
  management of configuration entities. It is currently undocumented.


networks:
==========

remove overly-restrictive assert from dragonfly

workloads:
==========

enabled max wait time stat in MPI replay

print an error message if pending waits are never processed in MPI replay

utilities:
==========

the mapping context API gained a new mapping method, which uses the ratio of
  mapped-from to mapped-to entities to map contiguous mapped-from IDs.
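
The ratio-based method described above can be pictured with a small sketch.
The helper name ratio_map and its arguments are hypothetical and are not part
of the CODES mapping context API; the sketch also assumes that the mapped-from
count is an exact multiple of the mapped-to count.

```c
#include <assert.h>

/* Hypothetical helper, NOT the CODES API: map each contiguous block of
 * mapped-from IDs onto one mapped-to ID, using the ratio of the two counts.
 * Assumption: num_from is a positive multiple of num_to. */
int ratio_map(int from_id, int num_from, int num_to)
{
    assert(num_to > 0 && num_from >= num_to);
    assert(num_from % num_to == 0);
    assert(from_id >= 0 && from_id < num_from);

    /* number of mapped-from entities served by each mapped-to entity */
    int ratio = num_from / num_to;

    /* contiguous IDs [k*ratio, (k+1)*ratio) all land on mapped-to ID k */
    return from_id / ratio;
}
```

With 8 mapped-from and 2 mapped-to entities, IDs 0-3 would map to entity 0 and
IDs 4-7 to entity 1 under this sketch.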

0.5.1 (June 09, 2016)

networks:
==========

corrected link latency calculation in dragonfly model

fixed printf argument mismatch in dragonfly model

refactored the torus bandwidth calculation to mirror that of the other
  networks (no functional change)

more robust type conversions in dragonfly (int sizes -> uint64_t)

removed the redundant and obsolete MPI replay simulator
  (modelnet-mpi-wrklds.c). The proper version to use is modelnet-mpi-replay.c

----------

0.5.0 (May 24, 2016)

general:
==========

codes-base and codes-net have been combined into a single project (now at
https://xgitlab.cels.anl.gov/codes/codes).

updated to ROSS revision d9cef53.

fixed a large number of warnings across the codebase.

networks:
==========

addition of the SlimFly network topology, corresponding to the Wolfe et al.
  paper "Modeling a Million-node Slim Fly Network using Parallel Discrete-event
  Simulation", at SIGSIM-PADS'16. See README.slimfly.txt
  (src/networks/model-net/doc).

modelnet now supports sampling at regular intervals. Dragonfly LPs can
  currently make use of this - others can be added based on demand. See
  src/network-workloads/README_synthetic.txt.

credit-based flow control in the dragonfly and torus network models has been
  updated. The dragonfly model's adaptive routing algorithms have also been
  updated. For details, see the paper "Enabling Parallel Simulation of
  Large-Scale HPC Network Systems", M. Mubarak et al., in IEEE Trans. on
  Parallel and Distributed Systems (TPDS).

allow 0-byte messages in model-net.

enable "local" model-net messages (for LPs sharing the same model-net endpoint -
  approximates a zero-copy ownership pass of the payload)

the model_net_event family of functions now returns a token value, which must
  be passed to model_net_event_rc2.

workloads:
==========

concurrent workload support added to workload generators and the MPI
  simulation layer. See codes-jobmap.h and the modified codes-workload.h.
  Thanks to Xu Yang for the partial contributions. Not all workload generators
  support concurrent workloads at this time.

added scripts for generating job allocations specific to the torus and
  dragonfly topologies. See scripts/allocation_gen. Thanks to Xu Yang for the
  contributions.

multiple fixes to the MPI simulation layer and DUMPI workload generator.

concurrent workload support and more flexible rank mappings in the MPI
  simulation layer. Thanks to Xu Yang for the initial code.

removed scalatrace workload generator, which never made it to a usable state.

a new checkpoint IO workload generator has been added, based on the Daly paper
  "A higher order estimate of the optimum checkpoint interval for restart
  dumps" in Future Generation Computer Systems 2004. See README.codes-workload
  (src/workload).
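
For orientation, the higher-order estimate from the Daly paper is usually
quoted in the form below. Treat the exact form as an assumption to verify
against the paper; it says nothing about how the CODES generator itself is
parameterized. Here \delta is the time to write one checkpoint and M is the
mean time between failures.

```latex
\tau_{\mathrm{opt}} =
\begin{cases}
\sqrt{2\delta M}\left[1 + \frac{1}{3}\left(\frac{\delta}{2M}\right)^{1/2}
  + \frac{1}{9}\left(\frac{\delta}{2M}\right)\right] - \delta, & \delta < 2M \\
M, & \delta \ge 2M
\end{cases}
```

As a worked instance under that assumption, \delta = 5 minutes and M = 1440
minutes give \sqrt{2\delta M} = 120 and a bracketed factor of about 1.0141, so
\tau_{\mathrm{opt}} is roughly 116.7 minutes.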

utilities:
==========

fixes to rc-stack - memory no longer leaks in sequential mode, and optimistic
  debug mode is now supported.

added a more performance-sensitive function, codes_mapping_get_lp_info2, to
  codes-mapping (passes const pointers around instead of copying strings)

added a "mapping context" API for better controlling implicit LP-LP mappings,
  including modelnet, local storage model, and resource. See
  codes/codes-mapping-context.h and added functions in the mentioned LPs for
  details.

formalized a callback mechanism for CODES, replacing previous ad-hoc
  methods of passing control from LPs to arbitrary other LPs. The request and
  local storage model LP APIs have been changed to use this mechanism, and
  model-net has additional APIs to use user-provided mapping contexts. See
  codes/codes-callback.h for the API and tests/resource-test.c for advanced
  usage.

deprecations:
==========

model_net_event_rc (use model_net_event_rc2, which will eventually be renamed
  to model_net_event_rc)

codes_event_new (define your own bounds-checking macro if need be)

----------

0.4.1 (September 30, 2015)

general:
==========

fix compatibility with recent ROSS releases

----------

0.4.0 (May 6, 2015)

codes-base

general:
==========

significant source reorganization / refactoring

refactor some private headers out of the public eye

dead code removal

documentation:
==========

improved example_heterogeneous example program

added configuration to example_heterogeneous showing two torus networks

reorganized files to prevent name collisions on OSX. Top-level docs other than
  copyright now in doc directory

additions to best practice document

configurator:
==========

more stable file format for configurator output

ignore unrelated parameters passed into filter_configs

handle empty cfields in configurator

workloads:
==========

combined network and IO workload APIs into a single one

adding dumpi workload support in codes-workload-dump utility

workload dump utility option cleanup

renamed "bgp" workload generator to "iolang", significant cleanups

put network workload ops in workload dump util

removing one of the dumpi libraries from the build. It was generating some unwanted dumpi files.

network workload API more fleshed out

utilities:
==========

configuration bug fixes for larger LP type counts

resource LP annotation mapping hooks

local storage model API switch to use annotations

better configuration error handling

hedge against precision loss in codes_local_latency (see codes.h)

use a different RNG than default for codes_local_latency
- prevents addition/removal of codes_local_latency calls from poisoning RNG
  stream of calling model

added simple GVT-aware stack with garbage collection (see rc-stack.h)
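
The motivation for giving codes_local_latency its own RNG, noted above, can be
illustrated with a toy example. Everything below is hypothetical and is not
ROSS or CODES code: a plain 64-bit LCG stands in for an RNG stream, and the
names rng_stream, lp_rngs, and local_latency_sketch are invented for this
sketch. The point is only that draws from a dedicated stream leave the model's
own stream untouched.

```c
#include <assert.h>
#include <stdint.h>

/* Toy linear congruential generator standing in for one RNG stream. */
typedef struct { uint64_t state; } rng_stream;

uint32_t rng_next(rng_stream *s)
{
    assert(s != 0);
    s->state = s->state * 6364136223846793005ULL + 1442695040888963407ULL;
    return (uint32_t)(s->state >> 32);
}

/* Hypothetical per-LP state: the model's own stream plus a separate
 * stream reserved for latency jitter. */
typedef struct {
    rng_stream model;
    rng_stream latency;
} lp_rngs;

/* Returns a small jitter value in [0.1, 0.2). It draws ONLY from the
 * dedicated latency stream, never from the model stream. */
double local_latency_sketch(lp_rngs *lp)
{
    return 0.1 + (double)(rng_next(&lp->latency) % 1000) / 10000.0;
}

/* Returns 1 if the next model-stream value is the same with and without
 * extra_latency_calls jitter draws made beforehand. */
int model_stream_unaffected(uint64_t seed, int extra_latency_calls)
{
    lp_rngs a = { { seed }, { seed ^ 0x9e3779b97f4a7c15ULL } };
    lp_rngs b = a;
    for (int i = 0; i < extra_latency_calls; i++)
        (void)local_latency_sketch(&b);
    return rng_next(&a.model) == rng_next(&b.model);
}
```

In this sketch, adding or removing calls to local_latency_sketch changes only
the latency stream, so the sequence of values the model itself draws stays the
same.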

codes-net

general:
==========

cleanup of much of the code base

more informative error for failure to find modelnet lps

removed redundant include directory on install (was 'install/codes/codes/*.h')

documentation:
==========

reorganized files to prevent name collisions on OSX. Top-level docs other than
  copyright now in doc directory

updated code documentation

fix linker error in certain cases with codes-base

tweaked config error handling


networks:
==========
fix to loggp latency calculation when using "receive queue"

made torus lps agnostic to groups and aware of annotations

miscellaneous fixes to dragonfly model

updates to simplep2p: support for having different latency/bw at sender &
  receiver end. See src/models/networks/model-net/doc/README.simplep2p.txt

minor fixes to usage of quickhash in replay tool

fixed RNG reverse computation bug in loggp

fixed swapped arguments in round-robin scheduler causing short circuit

workloads:
==========

minor changes to dumpi trace config files

resolving minor bug with reverse computation in dumpi traces

updating network trace code to use the combined workload API

adding synthetic traffic patterns (currently with dragonfly model)

adding network workload test program for debugging

updating MPI wait/wait_all code in replay tool

----------

0.3.0 (November 7, 2014)

codes-base

Initial "official" release. Against previous repository revisions, this release
includes more complete documentation.

codes-net

Initial "official" release. Against previous repository revisions, this release
includes more complete documentation and a rename of the "simplewan" model to
the "simplep2p" (simple point-to-point) model to more accurately represent
what it's modeling.