codes-best-practices.tex 26.2 KB
Newer Older
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31

% Add the compsoc option for Computer Society conferences.
% If IEEEtran.cls has not been installed into the LaTeX system files,
% manually specify the path to it like:
% \documentclass[conference]{../sty/IEEEtran}

% cite.sty was written by Donald Arseneau
% V1.6 and later of IEEEtran pre-defines the format of the cite.sty package
% \cite{} output to follow that of IEEE. Loading the cite package will
% result in citation numbers being automatically sorted and properly
% "compressed/ranged". e.g., [1], [9], [2], [7], [5], [6] without using
% cite.sty will become [1], [2], [5]--[7], [9] using cite.sty. cite.sty's
% \cite will automatically add leading space, if needed. Use cite.sty's
% noadjust option (cite.sty V3.8 and later) if you want to turn this off.
% cite.sty is already installed on most LaTeX systems. Be sure and use
% version 4.0 (2003-05-27) and later if using hyperref.sty. cite.sty does
% not currently provide for hyperlinked citations.
% The latest version can be obtained at:
% The documentation is contained in the cite.sty file itself.
32 33
Jonathan Jenkins's avatar
Jonathan Jenkins committed
35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51

\lstset{ %

52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152

% declare the path(s) where your graphic files are
% and their extensions so you won't have to specify these with
% every instance of \includegraphics

 % \usepackage[pdftex]{graphicx}
 % declare the path(s) where your graphic files are
 % \graphicspath{{../pdf/}{../jpeg/}}
 % and their extensions so you won't have to specify these with
 % every instance of \includegraphics
 % \DeclareGraphicsExtensions{.pdf,.jpeg,.png}
 % or other class option (dvipsone, dvipdf, if not using dvips). graphicx
 % will default to the driver specified in the system graphics.cfg if no
 % driver is specified.
 % \usepackage[dvips]{graphicx}
 % declare the path(s) where your graphic files are
 % \graphicspath{{../eps/}}
 % and their extensions so you won't have to specify these with
 % every instance of \includegraphics
 % \DeclareGraphicsExtensions{.eps}

% Frank Mittelbach's and David Carlisle's array.sty patches and improves
% the standard LaTeX2e array and tabular environments to provide better
% appearance and additional user controls. As the default LaTeX2e table
% generation code is lacking to the point of almost being broken with
% respect to the quality of the end results, all users are strongly
% advised to use an enhanced (at the very least that provided by array.sty)
% set of table tools. array.sty is already installed on most systems. The
% latest version and documentation can be obtained at:

% Also of notable interest is Scott Pakin's eqparbox package for creating
% (automatically sized) equal width boxes - aka "natural width parboxes".
% Available at:


% \usepackage{subfigure}
% subfigure.sty was written by Steven Douglas Cochran. This package makes it
% easy to put subfigures in your figures. e.g., "Figure 1a and 1b". For IEEE
% work, it is a good idea to load it with the tight package option to reduce
% the amount of white space around the subfigures. subfigure.sty is already
% installed on most LaTeX systems. The latest version and documentation can
% be obtained at:
% subfigure.sty has been superceeded by subfig.sty.

% subfig.sty, also written by Steven Douglas Cochran, is the modern
% replacement for subfigure.sty. However, subfig.sty requires and
% automatically loads Axel Sommerfeldt's caption.sty which will override
% IEEEtran.cls handling of captions and this will result in nonIEEE style
% figure/table captions. To prevent this problem, be sure and preload
% caption.sty with its "caption=false" package option. This is will preserve
% IEEEtran.cls handing of captions. Version 1.3 (2005/06/28) and later 
% (recommended due to many improvements over 1.2) of subfig.sty supports
% the caption=false option directly:
% The latest version and documentation can be obtained at:
% The latest version and documentation of caption.sty can be obtained at:

% url.sty was written by Donald Arseneau. It provides better support for
% handling and breaking URLs. url.sty is already installed on most LaTeX
% systems. The latest version can be obtained at:
% Read the url.sty source comments for usage information. Basically,
% \url{my_url_here}.

% *** Do not adjust lengths that control margins, column widths, etc. ***
% *** Do not use packages that alter fonts (such as pslatex).         ***
% There should be no need to do such things with IEEEtran.cls V1.6 and later.
% (Unless specifically asked to do so by the journal or conference you plan
% to submit to, of course. )

% correct bad hyphenation here
\hyphenation{op-tical net-works semi-conduc-tor}

153 154 155 156 157

158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177
\title{CODES Best Practices}

%\author{\IEEEauthorblockN{Someone\IEEEauthorrefmark{1}} \\

%\numberofauthors{6} %  in this sample file, there are a *total*

% use for special paper notices
%\IEEEspecialpapernotice{(Invited Paper)}

% use arabic rather than roman numerals for table references

% make the title area

178 179
This document outlines best practices for developing models in the
CODES/ROSS framework.  The reader should already be familiar with ROSS
Philip Carns's avatar
Philip Carns committed
and discrete event simulation in general; those topics are covered in the primary
181 182 183
ROSS documentation. Additionally, the GETTING\_STARTED file presents a better
introduction/overview to CODES - this guide should be consulted after becoming
familiar with CODES/ROSS.
184 185
The main purpose of this document is to help the reader produce
CODES models in a consistent, modular style so that components can be more
187 188
easily shared and reused.  It also includes a few tips to help avoid common
simulation bugs.
189 190 191

For more information, ROSS has a bunch of documentation available in their
repository/wiki - see \url{}.
192 193

\section{CODES: modularizing models}

196 197 198
This section covers some of the basic principles of how to organize model
components to be more modular and easier to reuse across CODES models.

\subsection{Units of time}

ROSS does not dictate the units to be used in simulation timestamps.
Philip Carns's avatar
Philip Carns committed
The \texttt{tw\_stime} type could represent any time unit
(e.g. days, hours, seconds, nanoseconds, etc.).  When building CODES
Philip Carns's avatar
Philip Carns committed
204 205
models you should \emph{always treat timestamps as double precision floating
point numbers representing nanoseconds}, however.
206 207 208
All components within a model must agree on the time units in order to
advance simulation time consistently.  Several common utilities in the
CODES project expect to operate in terms of nanoseconds.

210 211
\subsection{Organizing models by LP types}

212 213 214 215 216 217 218 219 220 221 222
ROSS allows you to use as many different LP types as you would like to
construct your models.  Try to take advantage of this as much as possible by
organizing your simulation so that each component of the system that you are
modeling is implemented within its own LP type.  For example, a storage
system model might use different LPs for hard disks, clients, network
adapters, and servers.  There are multiple reasons for dividing up models
like this:

\item General modularity: makes it easier to pull out particular components
(for example, a disk model) for use in other models.
\item Simplicity: if each LP type is only handling a limited set of
224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240
events, then the event structure, state structure, and event handler
functions will all be much smaller and easier to understand.
\item Reverse computation: it makes it easier to implement reverse
computation, not only because the code is simpler, but also because you can
implement and test reverse computation per component rather than having to
apply it to an entire model all at once before testing.

It is also important to note that you can divide up models not just by
hardware components, but also by functionality, just as
you would modularize the implementation of a distributed file system.  For
example, a storage daemon might include separate LPs for replication, failure
detection, and reconstruction.  Each of those LPs can share the same network
card and disk resources for accurate modeling of resource usage.  They key
reason for splitting them up is to simplify the model and to encourage

Philip Carns's avatar
Philip Carns committed
241 242 243 244 245 246 247 248 249 250 251 252
One hypothetical downside to splitting up models into multiple LP types is that it likely
means that your model will generate more events than a monolithic model
would have.  Remember that \emph{ROSS is really efficient at generating and
processing events}, though!  It is usually a premature optimization to try to optimize a model by
replacing events with function calls in cases where you know the necessary
data is available on the local MPI process.  Also recall that any information
exchanged via event automatically benefits by shifting burden for
tracking/retaining event data and event ordering into ROSS rather than your
model.  This can help simplify reverse computation by breaking complex
operations into smaller, easier to understand (and reverse) event units with
deterministic ordering.

253 254 255 256 257 258 259 260 261 262 263 264
\subsection{Sharing message representation}

It is often difficult to debug cases where an LP sends a message to the wrong
LP, as the event structures can be completely different. Hence, it greatly aids
debugging to adhere to a common structure in messages. In particular, the
message header struct \texttt{msg\_header} in lp-msg.h should be placed and
used at the top of every LP's event structure, enabling inspection of any kind
of message in the simulation. The ``magic'' number should be unique to each LP
type to delineate what the expected type of the intended LP recipient. It is a
similarly good idea to use unique event type IDs.

\subsection{Providing a sane communication API between LPs}

Philip Carns's avatar
Philip Carns committed
266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284
ROSS operates by exchanging events between LPs.  If an LP is sending
an event to another LP of the same type, then in general it can do so
by allocating an event structure (e.g. \texttt{tw\_event\_new()}),
populating the event structure, and transmitting it
(e.g. \texttt{tw\_event\_send()}).  If an LP is sending an event to
another LP of a \emph{different} type, however, then it should use an
explicit API to do so without exposing the other LP's event structure
definition.  Event structures are not a robust API for exchanging data
across different LP types.  If one LP type accesses the event (or state)
structure of another LP type, then it entangles the two components such
that one LP is dependent upon the internal architecture of another LP.
This not only makes it difficult to reuse components, but also makes it
difficult to check for incompatibilities at compile time.  The compiler
has no way to know which fields in a struct must be set before sending
an event.

For these reasons we encourage that a) each LP be implemented in a separate
source file and b) all event structs and state structs
be defined only within those source files.  They should not be exposed in external
285 286
headers.  If the definitions are placed in a header then it makes it
possible for those event and state structs to be used as an ad-hoc interface
Philip Carns's avatar
Philip Carns committed
between LPs of different types.

289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347
\section{Coping with time warp / reverse computation}

Time warp and ROSS's reverse computation mechanism, while vital to providing
scalable simulation performance, also complicates model development and
debugging. This section lists some ways of coping with these kinds of errors,
and with reverse computation in general.

The time warp protocol is susceptible to certain classes of simulation behavior
by which LPs are asked to perform messages that are potentially outside the
scope of the behavior the programmer intended to perform (not including logic
bugs by the programmer). An excellent discussion of this topic is given in the
paper ``The Dark Side of Risk (What your mother never told you about Time
Warp)'' by Nicol and Liu

As a small example, consider two LPs, $A$ and $B$. Say $A$ has sent some
message to $B$ at some logical time $t$, then at $t+1$, $A$ is rolled back and
sends an anti-message to $B$. However, before $B$ can process the anti-message,
it processes the original message sent by $A$ *and* sends a message back to
$A$. Now, $A$ receives a message that resulted from a state that, in $A$'s
view, is undefined/unexpected.

Depending on the event dependencies between LPs, issues such as the example can
easily occur in optimistic simulations. In general, when diagnosing these
issues it is useful to determine the full flow of events coming into an LP and
ordering dependencies between those events. For example, protocols for
performing a distributed storage write may have a number of steps represented
as events. Knowing the order that these events can arrive in the system, and
whether multiple events from e.g.\ different writes are possible can go a
long way in determining the vectors for possible errors.


Self-suspend is a technique for limiting how far down a path of undefined
behavior an LP goes when receiving unexpected/undefined combinations of events.
It is a relatively simple concept that falls into four steps:
    \item Aggressively check events for potential unexpected input, putting the
        LP into \emph{suspend mode} by setting a suspend counter and returning
        (keeping the LP state as it was before the offending event was
        processed). Typically these can intersect with asserts in LP code, but
        it is often unclear whether a model error is a programmer error or a
        result of time warp event ordering.
    \item While the suspend counter is positive, increment upon forward event
        receipt and decrement upon reverse event receipt. Don't process the
        received events.
    \item When the suspend counter returns to zero, start processing forward
        events again as normal.
    \item Report whether an LP is in suspend state at the end of the

The primary benefit of self-suspend is that it prevents arbitrary changing (or
destructing) of state based on unexpected messages, leading to a much more
stable simulation. Additionally, the machinery for self-suspend is easy to
implement -- steps 2--4 are a few lines of code each. Also, LP-IO can be
used upon encountering a suspend condition to give error specifics (the reverse
write would occur in the reverse handler in step 3).

Jonathan Jenkins's avatar
Jonathan Jenkins committed
348 349 350
Note that ROSS now has a self-suspend API -- the hand-implementation may still
be used, but it is preferable to use the ROSS version.

351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384
\subsection{Stash data needed for reverse computation within event structures}

Writing discrete-event simulations will necessarily involve ``destructive''
operations, which, in the view of an optimistic simulation, are operations in
which the information needed for rollback is no longer available. Destructive
operations include:
    \item Re-assignment of a variable (losing the original value)
    \item Most floating point operations. Floating-point math is not
        associative and rounding errors cause issues such as
        $a+b-b \neq a$, which need to be considered when making a simulation
        that involves floating-point math.
    \item \texttt{free}'ing data.

One nice property of events is that the data in an event structure will stick
around until GVT sweeps by and the event is guaranteed to be no longer needed.
Hence, one strategy for rolling back destructive operations is to stash the
original values in the event structures causing the destruction, and restoring
them upon rollback. The primary downside of this is that event structure size
increases, which increases ROSS-related overheads (manipulating event-related
data structures and sending events to other processes).

\subsection{Prefer static to dynamic memory in LP state}

In many cases (such as implementing data structures like queues and stacks), an
LP will want to malloc memory within in an event and free it within another.
This is discouraged for the time being. Once a piece of data is freed, it
cannot be recovered upon rollback later on. If your data structures being
allocated are simple and relatively small, you can put the data to be freed
directly into the event structure then free the original copy, though it will
increase the event structure size for the LP accordingly.

In the future, optimistic-mode-aware free lists may be provided by ROSS that
385 386 387
will mitigate this problem. At the moment, a manual implementation of this is
provided in codes by codes/rc-stack.h (see tests/rc-stack-test.c for a
simple demonstration).
388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417

\subsection{Handling non-trivial event dependencies: queuing example}

In storage system simulations, it will often be the case that clients, servers,
or both issue multiple asynchronous (parallel) operations, performing some
action upon the completion of them. More generally, the problem is: an event
issuance (an ack to the client) is based on the completion of more than one
asynchronous/parallel events (local write on primary server, forwarding write to
replica server). Further complicating the matter for storage simulations, there
can be any number of outstanding requests, each waiting on multiple events. 

In ROSS's sequential and conservative parallel modes, the necessary state can
easily be stored in the LP as a queue of statuses for each set of events,
enqueuing upon asynchronous event issuances and updating/dequeuing upon each
completion. Each LP can assign unique IDs to each queue item and propagate the
IDs through the asynchronous events for lookup purposes. However, in optimistic
mode we may remove an item from the queue and then be forced to re-insert it
during reverse computation.

Naively, one could simply never remove queue items, but of course memory will
quickly be consumed.

An elegant solution to this is to \emph{cache the status state in the event
structure that causes the dequeue}. ROSS's reverse computation semantics ensures
that this event will be reversed before the completion events of any of the
other asynchronous events, allowing us to easily recover the state. Furthermore,
events are garbage-collected as the GVT, reducing memory management complexity.
However, this strategy has the disadvantage of increasing event size

Jonathan Jenkins's avatar
Jonathan Jenkins committed
\section{CODES/ROSS: general tips and tricks}

420 421 422 423 424 425 426 427 428 429 430 431
\subsection{Initializing the model}

There are two conceptual steps to initializing a CODES model - LP registration
in ROSS and configuration via consulting the CODES configuration file. In older
versions of models we wrote, these two steps were together. However, it is
highly suggested to separate these two steps into different functions, with the
registration occurring before the call to \texttt{codes\_mapping\_setup}, and
the configuration occurring after the call. This allows the codes-mapping API
to be used at configuration time, which is often useful when LPs need to know
things like LP counts and doing these in the ROSS LP init function would lead
to unnecessary computation. It is especially useful for configuration schemes
that require knowledge of LP annotations.

433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454
\subsection{LP-IO usage}

LP-IO is a simple and useful optimistic-aware IO utility for optimistic
simulations. Based on our usage, we have the following recommendations for
effective usage of it:
    \item Use the command-line to configure turning IO on and off in its
        entirety, and to specify where the output should be placed. Suggested
            \item \texttt{--lp-io-dir=DIR} -- use DIR as the output directory -
                absence of the option indicates no LP-IO output.
            \item \texttt{--lp-io-use-suffix=DUMMY} -- add the PID of the root
                rank to the directory name to avoid clashes between
                multiple runs. If not specified, then the DIR option
                will be exactly used, possibly leading to an error/exit. The
                dummy argument is due to a ROSS limitation of not allowing
                ``flag''-style options (options with no arguments).
    \item Use LP-specific options in the CODES configuration file to drive
        specific options for output within the LP.

Jonathan Jenkins's avatar
Jonathan Jenkins committed
\subsection{Avoiding event timestamp ties}

Jonathan Jenkins's avatar
Jonathan Jenkins committed
458 459 460 461 462 463
Event timestamp ties in ROSS occur when two or more events have the same
timestamp. These have a variety of unintended consequences, most significant of
which is hampering both reproducability and determinism in simulations. To
avoid this, use codes\_local\_latency for events with small or zero time deltas
to add some random noise. codes\_local\_latency must be reversed, so use
codes\_local\_latency\_reverse in reverse event handlers.

Jonathan Jenkins's avatar
Jonathan Jenkins committed
465 466 467 468 469 470
One example of this usage is exchanging events between LPs without really
consuming significant time (for example, to transfer information from a server
to its locally attached network card). It is tempting to use a timestamp of 0,
but this would cause timestamp ties in ROSS. Use of codes\_local\_latency for
timing of local event transitions in this case can be thought of as bus
overhead or context switch overhead.
471 472 473

\subsection{Organizing event structures}

Jonathan Jenkins's avatar
Jonathan Jenkins committed
Since a single event structure contains data for all of the different types of
475 476 477
events processed by the LP, use a type enum + unions (otherwise known as a
``tagged struct'') as an organizational strategy. Keeps the event size down and
makes it a little clearer what variables are used by which event types.
478 479 480

\subsection{Validating across simulation modes}

Jonathan Jenkins's avatar
Jonathan Jenkins committed
481 482 483
During development, you should do test runs with serial, parallel conservative,
and parallel optimistic runs to make sure that you get consistent results.
These modes stress different aspects of the model.
484 485 486

\subsection{How to complete a simulation}

Jonathan Jenkins's avatar
Jonathan Jenkins committed
Most core ROSS examples are design to intentionally hit
the end timestamp for the simulation (i.e. they are modeling a continuous,
Jonathan Jenkins's avatar
Jonathan Jenkins committed
489 490 491
steady state system). This isn't necessarily true for other models. Quite
simply, set g\_tw\_ts\_end to an arbitrary large number when running simulations
that have a well-defined end-point in terms of events processed. 

493 494
Within the LP finalize function, do not call tw\_now. The time returned may not
be consistent in the case of an optimistic simulation.
Philip Carns's avatar
Philip Carns committed

496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531
\section{Best practices quick reference}

NOTE: these may be integrated with the remaining notes or used as a summary of

\subsection{ROSS simulation development}


    \item prefer fine-grained, simple LPs to coarse-grained, complex LPs
        \item can simplify both LP state and reverse computation implementation
        \item ROSS is very good at event processing, likely small difference in

    \item consider separating single-source generation of concurrent events with
        "feedback" events or "continue" events to self
        \item generating multiple concurrent events makes rollback more difficult

    \item use dummy events to work around "event-less" advancement of simulation time 

    \item add a small amount of time "noise" to events to prevent ties

    \item prefer more and smaller events to fewer and larger events
        \item simplifies individual event processing
        \item ROSS uses bounded event structure size in communication, so
            smaller bound $\rightarrow$  less communication overhead

    \item prefer placing state in event structure to LP state structure
        \item simplifies reverse computation -- less persistent state
Jonathan Jenkins's avatar
Jonathan Jenkins committed
        \item NOTE: tradeoff with previous point - consider efficiency vs.\
533 534 535 536 537 538 539 540

    \item try to implement event processing with only LP-local information
        \item reverse computation with collective knowledge is difficult

541 542 543 544 545 546 547
    \item separate ROSS registration from LP configuration functionality

    \item use self-suspend liberally
    \item stash data from destructive operations (floating point computations,
        freed data, re-assigned variables) in the event structure causing the
    \item prefer static memory in LP states to dynamic memory

549 550

551 552 553

Jonathan Jenkins's avatar
Jonathan Jenkins committed
554 555 556
    \item add code examples?
    \item put a pdf or latex2html version of this document on the codes web page
        when it's ready
557 558

Jonathan Jenkins's avatar
Jonathan Jenkins committed
\begin{comment} ==== SCRATCH MATERIAL ====
560 561 562 563 564 565 566 567 568 569 570 571
\begin{lstlisting}[caption=Example code snippet., label=snippet-example]
for (i=0; i<n; i++) {
    for (j=0; j<i; j++) {
        /* do something */

Figure ~\ref{fig:snippet-example} shows an example of how to show a code
snippet in latex.  We can use this format as needed throughout the document.
Jonathan Jenkins's avatar
Jonathan Jenkins committed