
% Add the compsoc option for Computer Society conferences.
% If IEEEtran.cls has not been installed into the LaTeX system files,
% manually specify the path to it like:
% \documentclass[conference]{../sty/IEEEtran}

% cite.sty was written by Donald Arseneau
% V1.6 and later of IEEEtran pre-defines the format of the cite.sty package
% \cite{} output to follow that of IEEE. Loading the cite package will
% result in citation numbers being automatically sorted and properly
% "compressed/ranged". e.g., [1], [9], [2], [7], [5], [6] without using
% cite.sty will become [1], [2], [5]--[7], [9] using cite.sty. cite.sty's
% \cite will automatically add leading space, if needed. Use cite.sty's
% noadjust option (cite.sty V3.8 and later) if you want to turn this off.
% cite.sty is already installed on most LaTeX systems. Be sure and use
% version 4.0 (2003-05-27) and later if using hyperref.sty. cite.sty does
% not currently provide for hyperlinked citations.
% The latest version can be obtained at:
% The documentation is contained in the cite.sty file itself.

\lstset{ %


% declare the path(s) where your graphic files are
% and their extensions so you won't have to specify these with
% every instance of \includegraphics

 % \usepackage[pdftex]{graphicx}
 % declare the path(s) where your graphic files are
 % \graphicspath{{../pdf/}{../jpeg/}}
 % and their extensions so you won't have to specify these with
 % every instance of \includegraphics
 % \DeclareGraphicsExtensions{.pdf,.jpeg,.png}
 % or other class option (dvipsone, dvipdf, if not using dvips). graphicx
 % will default to the driver specified in the system graphics.cfg if no
 % driver is specified.
 % \usepackage[dvips]{graphicx}
 % declare the path(s) where your graphic files are
 % \graphicspath{{../eps/}}
 % and their extensions so you won't have to specify these with
 % every instance of \includegraphics
 % \DeclareGraphicsExtensions{.eps}

% Frank Mittelbach's and David Carlisle's array.sty patches and improves
% the standard LaTeX2e array and tabular environments to provide better
% appearance and additional user controls. As the default LaTeX2e table
% generation code is lacking to the point of almost being broken with
% respect to the quality of the end results, all users are strongly
% advised to use an enhanced (at the very least that provided by array.sty)
% set of table tools. array.sty is already installed on most systems. The
% latest version and documentation can be obtained at:

% Also of notable interest is Scott Pakin's eqparbox package for creating
% (automatically sized) equal width boxes - aka "natural width parboxes".
% Available at:


% \usepackage{subfigure}
% subfigure.sty was written by Steven Douglas Cochran. This package makes it
% easy to put subfigures in your figures. e.g., "Figure 1a and 1b". For IEEE
% work, it is a good idea to load it with the tight package option to reduce
% the amount of white space around the subfigures. subfigure.sty is already
% installed on most LaTeX systems. The latest version and documentation can
% be obtained at:
% subfigure.sty has been superseded by subfig.sty.

% subfig.sty, also written by Steven Douglas Cochran, is the modern
% replacement for subfigure.sty. However, subfig.sty requires and
% automatically loads Axel Sommerfeldt's caption.sty which will override
% IEEEtran.cls handling of captions and this will result in nonIEEE style
% figure/table captions. To prevent this problem, be sure and preload
% caption.sty with its "caption=false" package option. This will preserve
% IEEEtran.cls handling of captions. Version 1.3 (2005/06/28) and later
% (recommended due to many improvements over 1.2) of subfig.sty supports
% the caption=false option directly:
% The latest version and documentation can be obtained at:
% The latest version and documentation of caption.sty can be obtained at:

% url.sty was written by Donald Arseneau. It provides better support for
% handling and breaking URLs. url.sty is already installed on most LaTeX
% systems. The latest version can be obtained at:
% Read the url.sty source comments for usage information. Basically,
% \url{my_url_here}.

% *** Do not adjust lengths that control margins, column widths, etc. ***
% *** Do not use packages that alter fonts (such as pslatex).         ***
% There should be no need to do such things with IEEEtran.cls V1.6 and later.
% (Unless specifically asked to do so by the journal or conference you plan
% to submit to, of course. )

% correct bad hyphenation here
\hyphenation{op-tical net-works semi-conduc-tor}

\title{CODES Best Practices}

%\author{\IEEEauthorblockN{Someone\IEEEauthorrefmark{1}} \\

%\numberofauthors{6} %  in this sample file, there are a *total*

% use for special paper notices
%\IEEEspecialpapernotice{(Invited Paper)}

% use arabic rather than roman numerals for table references

% make the title area

This document outlines best practices for developing models in the
CODES/ROSS framework.  The reader should already be familiar with ROSS
and discrete event simulation in general; those topics are covered in the primary
ROSS documentation.
The main purpose of this document is to help the reader produce
CODES models in a consistent, modular style so that components can be more
easily shared and reused.  It also includes a few tips to help avoid common
simulation bugs.

\section{CODES: modularizing models}

This section covers some of the basic principles of how to organize model
components to be more modular and easier to reuse across CODES models.

\subsection{Units of time}

ROSS does not dictate the units to be used in simulation timestamps.
The \texttt{tw\_stime} type is a double precision
floating point number that could represent any time unit
(e.g. days, hours, seconds, nanoseconds, etc.).  When building CODES
models you should \emph{always treat timestamps as nanoseconds}, however.
All components within a model must agree on the time units in order to
advance simulation time consistently.  Several common utilities in the
CODES project expect to operate in terms of nanoseconds.
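
As an illustration of the nanosecond convention, the following sketch
converts other units to nanoseconds before they are used as timestamps.
The helper names (\texttt{ns\_from\_us}, \texttt{ns\_from\_ms},
\texttt{ns\_from\_s}, \texttt{transfer\_time\_ns}) are hypothetical and are
not part of ROSS or CODES, and the \texttt{tw\_stime} typedef is only a
stand-in so that the fragment compiles on its own.

```c
#include <assert.h>

/* Stand-in for the ROSS timestamp type; a real model gets this typedef
 * from the ROSS headers. */
typedef double tw_stime;

/* Hypothetical helpers (not part of ROSS or CODES): convert common units
 * to the nanosecond convention used for CODES timestamps. */
tw_stime ns_from_us(double us) { return us * 1.0e3; }
tw_stime ns_from_ms(double ms) { return ms * 1.0e6; }
tw_stime ns_from_s(double s)   { return s * 1.0e9; }

/* Time in nanoseconds needed to move `bytes` at `mib_per_s` MiB/s. */
tw_stime transfer_time_ns(double bytes, double mib_per_s)
{
    double bytes_per_s;

    assert(mib_per_s > 0.0);
    bytes_per_s = mib_per_s * 1024.0 * 1024.0;
    return ns_from_s(bytes / bytes_per_s);
}
```

A model could call helpers like these wherever a latency or bandwidth
parameter is read in, so that values in other units never reach an event
timestamp directly.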

\subsection{Organizing models by LP types}

ROSS allows you to use as many different LP types as you would like to
construct your models.  Try to take advantage of this as much as possible by
organizing your simulation so that each component of the system that you are
modeling is implemented within its own LP type.  For example, a storage
system model might use different LPs for hard disks, clients, network
adapters, and servers.  There are multiple reasons for dividing up models
like this:

\begin{itemize}
\item General modularity: makes it easier to pull out particular components
(for example, a disk model) for use in other models.
\item Simplicity: if each LP type is only handling a limited set of
events, then the event structure, state structure, and event handler
functions will all be much smaller and easier to understand.
\item Reverse computation: it makes it easier to implement reverse
computation, not only because the code is simpler, but also because you can
implement and test reverse computation per component rather than having to
apply it to an entire model all at once before testing.
\end{itemize}

It is also important to note that you can divide up models not just by
hardware components, but also by functionality, just as
you would modularize the implementation of a distributed file system.  For
example, a storage daemon might include separate LPs for replication, failure
detection, and reconstruction.  Each of those LPs can share the same network
card and disk resources for accurate modeling of resource usage.  The key
reason for splitting them up is to simplify the model and to encourage reuse.

One potential downside of splitting a model into multiple LP types is that it
will likely generate more events than a monolithic model would.  Remember,
though, that \emph{ROSS is really efficient at generating and processing
events}!  Replacing events with function calls, in cases where you know the
necessary data is available on the local MPI process, is usually a premature
optimization.  Also recall that any information exchanged via events shifts
the burden of tracking and retaining event data and event ordering onto ROSS
rather than your model.  This can help simplify reverse computation by
breaking complex operations into smaller, easier to understand (and reverse)
event units with deterministic ordering.

TODO: reference example, for now see how the LPs are organized in Triton

\subsection{Protecting data structures}

Once you have organized a model into separate LP types, it is tempting to
transfer information between them by directly sending events to an LP or by
modifying the state of an LP from a different LP type.  This approach
entangles the components,
however, such that one LP is dependent upon the internal architecture of
another LP.  If you change one LP then you have to take care that you don't
break assumptions in other LPs that use their event or state structures.  This causes
problems for reuse.  It also means (even if you don't plan to reuse an
LP) that incompatibilities will be difficult to detect at compile time; the
compiler has no way to know which fields in a struct must be set before
sending an event.  Event structures are not a good public API for exchanging
information between different LP types.

For these reasons we recommend that all event structs and state structs
be defined (and accessible) only within the source file for the LP that
must use those structs.  They should not be exposed in external
headers.  If the definitions are placed in a header then it makes it
possible for those event and state structs to be used as an ad-hoc interface
between LPs.
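
A minimal sketch of this layout is shown below, collapsed into a single
fragment so that it compiles on its own.  All names (\texttt{disk-lp.h},
\texttt{disk\_lp\_send\_write}, \texttt{struct disk\_msg}, and so on) are
hypothetical, and event delivery is replaced by a direct function call; a
real model would send an event through ROSS at that point.  The intent is
only to show that the public header carries function prototypes while the
struct definitions stay in the LP's source file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* ---- disk-lp.h (hypothetical public header): prototypes only ---- */
void disk_lp_send_write(int disk_lp, size_t offset, size_t length);
size_t disk_lp_bytes_written(void);   /* test hook for this sketch only */

/* ---- disk-lp.c (hypothetical source file): structs stay private ---- */
enum disk_event_type { DISK_EV_WRITE = 1 };

struct disk_msg {
    enum disk_event_type type;
    size_t offset;
    size_t length;
};

static size_t bytes_written;   /* stand-in for the disk LP's state struct */

/* Stand-in for the disk LP's event handler. */
static void disk_handle_event(const struct disk_msg *m)
{
    if (m->type == DISK_EV_WRITE)
        bytes_written += m->length;
}

/* The argument list forces callers to supply every required field, so a
 * missing value is a compile error rather than an uninitialized member. */
void disk_lp_send_write(int disk_lp, size_t offset, size_t length)
{
    struct disk_msg m;

    assert(length > 0);
    memset(&m, 0, sizeof(m));
    m.type = DISK_EV_WRITE;
    m.offset = offset;
    m.length = length;
    (void)disk_lp;           /* a real model would address the event to this LP */
    disk_handle_event(&m);   /* a real model would send an event through ROSS */
}

size_t disk_lp_bytes_written(void) { return bytes_written; }
```

Because other LP types can only see the prototypes, a change to
\texttt{struct disk\_msg} cannot silently break them.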

Section~\ref{sec:completion} will describe alternatives for communicating
information between LP types.

TODO: reference example, for now see how structs are defined in Triton

\subsection{Techniques for exchanging information and completion events
across LP types}

TODO: fill this in.

Send events into an LP using a C function API that calls event\_new under
the covers.

Indicate completion back to the calling LP either by delivering an opaque
message back to the calling LP (passed in by the caller as a void*
argument) or by providing an API function that the second LP type can
call back (show examples of both).
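
The opaque-message variant might look roughly like the sketch below.  It is
an assumption-laden illustration, not the CODES implementation: the names
(\texttt{disk\_lp\_write}, \texttt{sim\_deliver}, \texttt{struct client\_msg},
etc.) are hypothetical, and a small dispatch table stands in for ROSS event
delivery so that the fragment is self-contained.  The disk LP copies the
caller's completion message without interpreting it and hands it back when
the operation finishes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_REPLY 64
enum { DISK_LP = 0, CLIENT_LP = 1, NUM_LPS = 2 };

/* Stand-in for the event layer: deliver a byte buffer to an LP id.  A real
 * model would allocate and send a ROSS event here instead. */
typedef void (*handler_fn)(const void *msg, size_t size);
static handler_fn handlers[NUM_LPS];

void sim_register(int lp, handler_fn fn)
{
    assert(lp >= 0 && lp < NUM_LPS);
    handlers[lp] = fn;
}

void sim_deliver(int lp, const void *msg, size_t size)
{
    assert(lp >= 0 && lp < NUM_LPS && handlers[lp] != NULL);
    handlers[lp](msg, size);
}

/* ---- private to the hypothetical disk LP ---- */
struct disk_msg {
    size_t bytes;             /* size of the write being modeled */
    int    reply_to;          /* LP to notify on completion */
    size_t reply_size;        /* length of the opaque reply */
    char   reply[MAX_REPLY];  /* caller's message, never interpreted here */
};

static void disk_handler(const void *msg, size_t size)
{
    const struct disk_msg *m = msg;

    assert(size == sizeof(*m));
    /* ... model the write, then return the caller's message untouched ... */
    sim_deliver(m->reply_to, m->reply, m->reply_size);
}

/* Public API of the disk LP: the only thing other LP types call. */
void disk_lp_write(size_t bytes, int reply_to,
                   const void *reply, size_t reply_size)
{
    struct disk_msg m;

    assert(reply_size <= MAX_REPLY);
    memset(&m, 0, sizeof(m));
    m.bytes = bytes;
    m.reply_to = reply_to;
    m.reply_size = reply_size;
    memcpy(m.reply, reply, reply_size);
    sim_deliver(DISK_LP, &m, sizeof(m));
}

/* ---- private to the hypothetical client LP ---- */
enum { CLIENT_WRITE_DONE = 7 };
struct client_msg { int type; int request_id; };
static int client_completed_id = -1;

static void client_handler(const void *msg, size_t size)
{
    struct client_msg m;

    assert(size == sizeof(m));
    memcpy(&m, msg, sizeof(m));
    if (m.type == CLIENT_WRITE_DONE)
        client_completed_id = m.request_id;
}

/* Round trip: the client asks for a write and receives its own message. */
int demo_write_roundtrip(void)
{
    struct client_msg done = { CLIENT_WRITE_DONE, 42 };

    sim_register(DISK_LP, disk_handler);
    sim_register(CLIENT_LP, client_handler);
    disk_lp_write(4096, CLIENT_LP, &done, sizeof(done));
    return client_completed_id;
}
```

Neither LP needs to see the other's struct definitions: the disk LP treats
the reply as bytes, and the client LP receives an event in its own format.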

\section{CODES: common utilities}


TODO: pull in Misbah's codes-mapping documentation.
TODO: fill this in.  Modelnet is a network abstraction layer for use in
CODES models.  It provides a consistent API that can be used to send
messages between nodes using a variety of different network transport
models.  Note that modelnet requires the use of the codes-mapping API,
described in the previous section.

TODO: fill this in.  lp-io is a simple API for storing modest-sized
simulation results (not continuous traces).  It handles reverse computation
and avoids doing any disk I/O until the simulation is complete.  All data is
written with collective I/O into a unified output directory.


TODO: fill this in.  codes\_event\_new is a small wrapper to tw\_event\_new
that checks the incoming timestamp and makes sure that you don't exceed the
global end timestamp for ROSS.  The assumption is that CODES models will
normally run to a completion condition rather than until simulation time
runs out; see a later section for more information on this approach.
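
The kind of check described above can be sketched as follows.  This is not
the source of \texttt{codes\_event\_new}; the function names are hypothetical,
the current time and end timestamp are passed in explicitly, and
\texttt{tw\_stime} is a stand-in typedef so that the fragment compiles
without ROSS.

```c
#include <assert.h>

/* Stand-in for the ROSS timestamp type. */
typedef double tw_stime;

/* Hypothetical helper: returns 1 if an event scheduled `offset` after `now`
 * would still fall before the global end timestamp, 0 otherwise. */
int event_fits_before_end(tw_stime now, tw_stime offset, tw_stime end_ts)
{
    return offset >= 0.0 && (now + offset) < end_ts;
}

/* Hypothetical wrapper: fail loudly rather than scheduling an event that
 * the simulator would never process.  Returns the absolute timestamp; a
 * real wrapper would go on to allocate the event. */
tw_stime checked_event_time(tw_stime now, tw_stime offset, tw_stime end_ts)
{
    assert(event_fits_before_end(now, offset, end_ts));
    return now + offset;
}
```

Failing at the point where the event is created makes it much easier to
find a model that is about to run past the end of simulated time.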

\section{CODES: reproducibility and model safety}

TODO: fill this in.  These are things that aren't required for modularity,
but just help you create models that produce consistent results and avoid
some common bugs.

\subsection{Event magic numbers}

TODO: fill this in.  Put magic numbers at the top of each event struct and
check them in the event handler.  This makes sure that you don't accidentally
send the wrong event type to an LP.
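
A minimal sketch of the idea, assuming two hypothetical LP types with their
own event structs and arbitrarily chosen magic constants:

```c
#include <assert.h>

/* Arbitrary, distinct constants; one per LP type (hypothetical values). */
#define DISK_MAGIC   0x4449534b
#define CLIENT_MAGIC 0x434c4e54

/* The magic number is always the first member of the event struct. */
struct disk_msg   { int magic; int type; unsigned long bytes; };
struct client_msg { int magic; int type; int request_id; };

/* Non-aborting check, usable on any incoming event buffer.  Reading the
 * first member through an int pointer is valid because a pointer to a
 * struct also points to its first member. */
int disk_msg_magic_ok(const void *raw)
{
    const int *magic = raw;
    return *magic == DISK_MAGIC;
}

/* Stand-in for the disk LP's event handler: verify before doing anything. */
unsigned long disk_event_handler(const struct disk_msg *m)
{
    assert(disk_msg_magic_ok(m));   /* wrong event type sent to this LP */
    return m->bytes;
}
```

The check costs one integer comparison per event and turns a silent
misinterpretation of another LP's event into an immediate failure.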

\subsection{Small timestamps for LP transitions}

TODO: fill this in.  Sometimes you need to exchange events between LPs
without really consuming significant time (for example, to transfer
information from a server to its locally attached network card).  It is
tempting to use a timestamp of 0, but this causes timestamp ties in ROSS
which might have a variety of unintended consequences.  Use
codes\_local\_latency for timing of local event transitions to add some
random noise; this can be thought of as bus overhead or context switch overhead.
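
The following sketch illustrates the general technique of a small, nonzero,
jittered delay.  It is not the implementation of
\texttt{codes\_local\_latency}: the function \texttt{local\_latency\_ns}, the
toy generator, and the base and jitter values are all assumptions made for
illustration.  In a real model the random draw would presumably come from
the LP's own ROSS random number stream so that it can be rolled back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the ROSS timestamp type. */
typedef double tw_stime;

/* Toy deterministic generator standing in for a per-LP random stream. */
struct toy_rng { uint64_t state; };

/* 64-bit linear congruential step; the top 53 bits are mapped to [0, 1). */
double toy_rng_unif(struct toy_rng *r)
{
    assert(r != NULL);
    r->state = r->state * 6364136223846793005ULL + 1442695040888963407ULL;
    return (double)(r->state >> 11) / 9007199254740992.0;   /* 2^53 */
}

/* Hypothetical helper: delay in nanoseconds for a local LP-to-LP hop.
 * The result is never zero and varies per draw, which avoids exact ties. */
tw_stime local_latency_ns(struct toy_rng *r)
{
    const tw_stime base_ns = 0.5;     /* assumed nominal local overhead */
    const tw_stime jitter_ns = 0.5;   /* assumed width of the random jitter */

    return base_ns + jitter_ns * toy_rng_unif(r);
}
```

With these assumed constants every local hop takes between 0.5 and 1.0
nanoseconds, which is negligible for the model but enough to separate events
that would otherwise share a timestamp.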

\section{ROSS: general tips}

\subsection{Organizing event structures}

TODO: fill this in.  The main idea is to use unions to organize fields
within event structures.  This keeps the size down and makes it a little
clearer which variables are used by which event types.
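
As a sketch of this layout, assuming a hypothetical server LP with two event
types, the event struct below keeps a type tag outside the union and one
union member per event type.  A flat version of the same fields is included
only for size comparison.

```c
#include <assert.h>
#include <stddef.h>

enum server_event_type { SRV_EV_WRITE = 1, SRV_EV_ACK };

/* Event struct organized with a union: `type` selects the valid member. */
struct server_msg {
    enum server_event_type type;
    union {
        struct { size_t offset; size_t length; } write;   /* SRV_EV_WRITE */
        struct { int request_id; int status; } ack;       /* SRV_EV_ACK */
    } u;
};

/* The same fields without a union, for comparison only. */
struct server_msg_flat {
    enum server_event_type type;
    size_t offset;
    size_t length;
    int request_id;
    int status;
};

/* Handlers switch on the tag and touch only the matching union member. */
size_t server_msg_bytes(const struct server_msg *m)
{
    assert(m != NULL);
    switch (m->type) {
    case SRV_EV_WRITE:
        return m->u.write.length;
    case SRV_EV_ACK:
        return 0;
    }
    return 0;
}
```

The union makes it visible in the declaration itself which fields belong to
which event type, and the struct is no larger than its largest variant plus
the tag.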

\subsection{Avoiding event timestamp ties}

TODO: fill this in.   Why ties are bad (hurts reproducibility, if not
accuracy, which in turn makes correctness testing more difficult).  Things
you can do to avoid ties, like skewing initial events by a random number

\subsection{Validating across simulation modes}

TODO: fill this in.  The general idea is that during development you should
do test runs with serial, parallel conservative, and parallel optimistic
runs to make sure that you get consistent results.  These modes stress
different aspects of the model.

\subsection{Reverse computation}

TODO: fill this in.  General philosophy of when the best time to add reverse
computation is (probably not in your initial rough draft prototype, but it
is best to go ahead and add it before the model is fully complete or else it
becomes too daunting/invasive).

Things you can do to make it easier: rely on ordering enforced by ROSS (each
reverse handler only needs to reverse a single event, in order), keep
functions small, build internal APIs for managing functions with reverse
functions, decide how to handle queues, etc.  Might need some more
subsubsections to break this up.
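
A small forward and reverse handler pair might look like the sketch below.
The names and fields are hypothetical, and the handlers take the state and
message directly rather than using the ROSS handler signature.  It shows two
common cases: an update that can be undone arithmetically, and a destructive
update whose old value is saved in the event so that the reverse handler can
restore it.

```c
#include <assert.h>

/* Hypothetical LP state and event structs. */
struct disk_state {
    unsigned long bytes_written;
    unsigned long max_write;      /* largest single write seen so far */
};

struct disk_msg {
    unsigned long length;         /* set by the sender */
    unsigned long saved_max;      /* filled in by the forward handler */
};

/* Forward handler: apply the event and record what reversal will need. */
void disk_write_fwd(struct disk_state *s, struct disk_msg *m)
{
    m->saved_max = s->max_write;      /* destructive update: save old value */
    s->bytes_written += m->length;    /* additive update: undo by subtracting */
    if (m->length > s->max_write)
        s->max_write = m->length;
}

/* Reverse handler: undo exactly one event, given the same message. */
void disk_write_rev(struct disk_state *s, const struct disk_msg *m)
{
    assert(s->bytes_written >= m->length);
    s->bytes_written -= m->length;
    s->max_write = m->saved_max;
}
```

Because events are reversed one at a time in the opposite order from which
they were processed, each reverse handler only has to undo its own event.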

\subsection{How to complete a simulation}

TODO: fill this in.  Most core ROSS examples are designed to intentionally hit
the end timestamp for the simulation (i.e. they are modeling a continuous,
steady state system).  This isn't necessarily true when modeling a
distributed storage system.  You might instead want the simulation to end
when you have completed a particular application workload (or collection of
application workloads), when a fault has been repaired, etc.  Talk about how
to handle this cleanly.


\begin{itemize}
\item Build a single example model that demonstrates the concepts in this
document, refer to it throughout.
\item reference to ROSS user's guide, airport model, etc.
\item put a pdf or latex2html version of this document on the codes web page
when ready
\end{itemize}

\begin{lstlisting}[caption=Example code snippet., label=snippet-example]
for (i=0; i<n; i++) {
    for (j=0; j<i; j++) {
        /* do something */
    }
}
\end{lstlisting}

Listing~\ref{snippet-example} shows an example of how to show a code
snippet in latex.  We can use this format as needed throughout the document.