Darshan-runtime installation and usage
======================================

== Introduction

This document describes darshan-runtime, which is the instrumentation
portion of the Darshan characterization tool.  It should be installed on the
system where you intend to collect I/O characterization information.

Darshan instruments applications via either compile time wrappers for static
executables or dynamic library preloading for dynamic executables.  An
application that has been instrumented with Darshan will produce a single
log file each time it is executed.  This log summarizes the I/O access patterns
used by the application.

The darshan-runtime instrumentation has traditionally only supported MPI
applications (specifically, those that call `MPI_Init()` and `MPI_Finalize()`),
but, as of version 3.2.0, Darshan also supports instrumentation of non-MPI
applications. Regardless of whether MPI is used, Darshan provides detailed
statistics about POSIX level file accesses made by the application.
In the case of MPI applications, Darshan additionally captures details on MPI-IO
level access, as well as limited information about HDF5 and PnetCDF access.
Note that instrumentation of non-MPI applications is currently only supported
in Darshan's shared library, which applications must `LD_PRELOAD`.

Starting in version 3.0.0, Darshan also exposes an API that can be used to develop
and add new instrumentation modules (for other I/O library interfaces or to gather
system-specific data, for instance), as detailed in
http://www.mcs.anl.gov/research/projects/darshan/docs/darshan-modularization.html[this document].
Newly contributed modules include a module for gathering system-specific parameters
for jobs running on BG/Q systems, a module for gathering Lustre striping data for
files on Lustre file systems, and a module for instrumenting stdio (i.e., stream I/O
functions like `fopen()`, `fread()`, etc).

Starting in version 3.1.3, Darshan also allows for full tracing of application I/O
workloads using the newly developed Darshan eXtended Tracing (DxT) instrumentation
module. This module can be selectively enabled at runtime to provide high-fidelity
traces of an application's I/O workload, as opposed to the coarse-grained I/O summary
data that Darshan has traditionally provided. Currently, DxT only traces at the POSIX
and MPI-IO layers. Initial link:DXT-overhead.pdf[performance results] demonstrate the
low overhead of DxT tracing, offering comparable performance to Darshan's traditional
coarse-grained instrumentation methods.

This document provides generic installation instructions, but "recipes" for
several common HPC systems are provided at the end of the document as well.

More information about Darshan can be found at the 
http://www.mcs.anl.gov/darshan[Darshan web site].

== Requirements

* C compiler (preferably GCC-compatible)
* zlib development headers and library

== Compilation

.Configure and build example (with MPI support)
----
tar -xvzf darshan-<version-number>.tar.gz
cd darshan-<version-number>/darshan-runtime
./configure --with-log-path=/darshan-logs --with-jobid-env=PBS_JOBID CC=mpicc
make
make install
----

.Configure and build example (without MPI support)
----
tar -xvzf darshan-<version-number>.tar.gz
cd darshan-<version-number>/darshan-runtime
./configure --with-log-path=/darshan-logs --with-jobid-env=PBS_JOBID --without-mpi CC=gcc
make
make install
----

.Explanation of configure arguments:
* `--with-mem-align=`: This value is system-dependent and will be
used by Darshan to determine if the buffer for a read or write operation is
aligned in memory (default is 8).
* `--with-jobid-env=` (mandatory): this specifies the environment variable that
Darshan should check to determine the jobid of a job.  Common values are
`PBS_JOBID` or `COBALT_JOBID`.  If you are not using a scheduler (or your
scheduler does not advertise the job ID) then you can specify `NONE` here.
Darshan will fall back to using the pid of the rank 0 process if the
specified environment variable is not set.
* `--with-log-path=` (this, or `--with-log-path-by-env`, is mandatory): This
specifies the parent directory for the directory tree where Darshan logs
will be placed.
* `--with-log-path-by-env=`: specifies an environment variable to use to
determine the log path at run time (see the example following this list).
* `--with-log-hints=`: specifies hints to use when writing the Darshan log
file.  See `./configure --help` for details.
* `--with-mod-mem=`: specifies the maximum amount of memory (in MiB) that
active Darshan instrumentation modules can collectively consume.
* `--with-zlib=`: specifies an alternate location for the zlib development
header and library.
* `CC=`: specifies the C compiler to use for compilation.
* `--without-mpi`: disables MPI support when building Darshan; MPI support is
assumed if this option is not specified.
* `--enable-mmap-logs`: enables the use of Darshan's mmap log file mechanism.
* `--disable-cuserid`: disables use of cuserid() at runtime.
* `--disable-ld-preload`: disables building of the Darshan LD_PRELOAD library.
* `--disable-bgq-mod`: disables building of the BG/Q module (by default, the
module is built only if a BG/Q environment is detected).
* `--enable-group-readable-logs`: sets Darshan log file permissions to allow
group read access.
* `--enable-HDF5-pre-1.10`: enables the Darshan HDF5 instrumentation module,
with support for HDF5 versions prior to 1.10.
* `--enable-HDF5-post-1.10`: enables the Darshan HDF5 instrumentation module,
with support for HDF5 versions 1.10 or higher.
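
For example, a minimal sketch of the `--with-log-path-by-env` approach (the
variable name and paths are illustrative, not prescribed):

----
./configure --with-log-path-by-env=DARSHAN_LOGPATH --with-jobid-env=SLURM_JOB_ID CC=mpicc

# at run time, each job sets the variable named at configure time:
export DARSHAN_LOGPATH=/scratch/$USER/darshan-logs
----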

=== Cross compilation

On some systems (notably the IBM Blue Gene series), the login nodes do not
have the same architecture or runtime environment as the compute nodes.  In
this case, you must configure darshan-runtime to be built using a cross
compiler.  The following configure arguments show an example for the BG/P system:

----
--host=powerpc-bgp-linux CC=/bgsys/drivers/ppcfloor/comm/default/bin/mpicc 
----

== Environment preparation

Once darshan-runtime has been installed, you must prepare a location
in which to store the Darshan log files and configure an instrumentation method.

=== Log directory

This step can be safely skipped if you configured darshan-runtime using the
`--with-log-path-by-env` option.  A more typical configuration uses a static
directory hierarchy for Darshan log
files.

The `darshan-mk-log-dirs.pl` utility will configure the path specified at
configure time to include
subdirectories organized by year, month, and day in which log files will be
placed. The deepest subdirectories will have sticky permissions to enable
multiple users to write to the same directory.  If the log directory is
shared system-wide across many users then the following script should be run
as root.
 
----
darshan-mk-log-dirs.pl
----

.A note about log directory permissions
[NOTE]
====
All log files written by Darshan have permissions set to only allow
read access by the owner of the file.  You can modify this behavior,
however, by specifying the `--enable-group-readable-logs` option at
configure time.  One notable deployment scenario would be to configure
Darshan and the log directories to allow all logs to be readable by both the
end user and a Darshan administrators group.  This can be done with the
following steps (sketched in the example after this note):

* set the --enable-group-readable-logs option at configure time
* create the log directories with darshan-mk-log-dirs.pl
* recursively set the group ownership of the log directories to the Darshan
administrators group
* recursively set the setgid bit on the log directories
====
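
A minimal sketch of these steps, assuming the `/darshan-logs` path from the
earlier examples and a hypothetical `darshan-admins` group:

----
# after configuring with --enable-group-readable-logs and installing:
darshan-mk-log-dirs.pl

# recursively hand ownership of the log tree to the administrators group
chgrp -R darshan-admins /darshan-logs

# set the setgid bit on every directory so new logs inherit the group
find /darshan-logs -type d -exec chmod g+s {} +
----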


=== Instrumentation method

The instrumentation method to use depends on whether the executables
produced by your compiler are statically or dynamically linked.  If you
are unsure, you can check by running `ldd <executable_name>` on an example
executable.  Dynamically-linked executables will produce a list of shared
libraries when this command is executed.
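
For example, output for a dynamically-linked binary versus a hypothetical
statically-linked one (library names and load addresses vary by system):

----
> ldd mpi-io-test
        libmpi.so.12 => /usr/lib/libmpi.so.12 (0x...)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x...)

> ldd mpi-io-test-static
        not a dynamic executable
----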

Some compilers allow you to toggle dynamic or static linking via options
such as `-dynamic` or `-static`.  Please check your compiler man page
for details if you intend to force one mode or the other.

== Instrumenting statically-linked MPI applications

Statically linked executables must be instrumented at compile time.
The simplest methods to do this are to either generate a customized
MPI compiler script (e.g. `mpicc`) that includes the link options and
libraries needed by Darshan, or to use existing profiling configuration
hooks for MPI compiler scripts.  Once this is done, Darshan
instrumentation is transparent; you simply compile applications using
the Darshan-enabled MPI compiler scripts.

=== Using a profile configuration 

[[static-prof]]
The MPICH MPI implementation supports the specification of a profiling library
configuration that can be used to insert Darshan instrumentation without
modifying the existing MPI compiler script.  Example profiling configuration
files are installed with Darshan 2.3.1 and later.  You can enable a profiling
configuration using environment variables or command line arguments to the
compiler scripts:

Example for MPICH 3.1.1 or newer:
----
export MPICC_PROFILE=$DARSHAN_PREFIX/share/mpi-profile/darshan-cc
export MPICXX_PROFILE=$DARSHAN_PREFIX/share/mpi-profile/darshan-cxx
export MPIFORT_PROFILE=$DARSHAN_PREFIX/share/mpi-profile/darshan-f
----

Example for MPICH 3.1 or earlier:
----
export MPICC_PROFILE=$DARSHAN_PREFIX/share/mpi-profile/darshan-cc
export MPICXX_PROFILE=$DARSHAN_PREFIX/share/mpi-profile/darshan-cxx
export MPIF77_PROFILE=$DARSHAN_PREFIX/share/mpi-profile/darshan-f
export MPIF90_PROFILE=$DARSHAN_PREFIX/share/mpi-profile/darshan-f
----

Examples for command line use:
----
mpicc -profile=$DARSHAN_PREFIX/share/mpi-profile/darshan-c <args>
mpicxx -profile=$DARSHAN_PREFIX/share/mpi-profile/darshan-cxx <args>
mpif77 -profile=$DARSHAN_PREFIX/share/mpi-profile/darshan-f <args>
mpif90 -profile=$DARSHAN_PREFIX/share/mpi-profile/darshan-f <args>
----

=== Using customized compiler wrapper scripts

[[static-wrapper]]
For MPICH-based MPI libraries, such as MPICH1, MPICH2, or MVAPICH,
custom wrapper scripts can be generated to automatically include Darshan
instrumentation.  The following example illustrates how to produce
wrappers for C, C++, and Fortran compilers:

----
darshan-gen-cc.pl `which mpicc` --output mpicc.darshan
darshan-gen-cxx.pl `which mpicxx` --output mpicxx.darshan
darshan-gen-fortran.pl `which mpif77` --output mpif77.darshan
darshan-gen-fortran.pl `which mpif90` --output mpif90.darshan
----
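
The generated wrappers can then be used in place of the usual compiler
scripts; for example (`mpi-io-test.c` is an illustrative source file name):

----
mpicc.darshan -o mpi-io-test mpi-io-test.c
----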

=== Other configurations

Please see the Cray recipe in this document for instructions on
instrumenting statically-linked applications on that platform.

For other MPI libraries you must manually modify the MPI compiler scripts to
add the necessary link options and libraries.  Please see the
`darshan-gen-*` scripts for examples or contact the Darshan users mailing
list for help.

== Instrumenting dynamically-linked MPI applications

For dynamically-linked executables, Darshan relies on the `LD_PRELOAD`
environment variable to insert instrumentation at run time.  The executables
should be compiled using the normal, unmodified MPI compiler.

To use this mechanism, set the `LD_PRELOAD` environment variable to the full
path to the Darshan shared library. The preferred method of inserting Darshan
instrumentation in this case is to set the `LD_PRELOAD` variable specifically
for the application of interest. Typically this is possible using
command line arguments offered by the `mpirun` or `mpiexec` scripts or by
the job scheduler:

----
mpiexec -n 4 -env LD_PRELOAD /home/carns/darshan-install/lib/libdarshan.so mpi-io-test
----

----
srun -n 4 --export=LD_PRELOAD=/home/carns/darshan-install/lib/libdarshan.so mpi-io-test
----

For sequential invocations of MPI programs, the following will set `LD_PRELOAD` for the duration of the process only:

----
env LD_PRELOAD=/home/carns/darshan-install/lib/libdarshan.so mpi-io-test
----

Other environments may have other specific options for controlling this behavior.
Please check your local site documentation for details.

It is also possible to simply export LD_PRELOAD as follows, but this is not
recommended because it can cause Darshan and MPI symbols to be pulled into
unrelated binaries:

----
export LD_PRELOAD=/home/carns/darshan-install/lib/libdarshan.so
----

[NOTE]
For SGI systems running the MPT environment, it may be necessary to set the `MPI_SHEPHERD`
environment variable equal to `true` to avoid deadlock when preloading the Darshan shared
library.
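
For example:

----
export MPI_SHEPHERD=true
----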

=== Instrumenting dynamically-linked Fortran applications

Please follow the general steps outlined in the previous section.  For
Fortran applications compiled with MPICH you may have to take the additional
step of adding
`libfmpich.so` to your `LD_PRELOAD` environment variable. For example:

----
export LD_PRELOAD=/path/to/mpi/used/by/executable/lib/libfmpich.so:/home/carns/darshan-install/lib/libdarshan.so
----

[NOTE]
The full path to the libfmpich.so library can be omitted if the rpath
variable points to the correct path.  Be careful to check the rpath of the
Darshan library and the executable before using this configuration, however;
they may provide conflicting paths.  Ideally the rpath to the MPI library
would *not* be set by the Darshan library, but would instead be specified
exclusively by the executable itself.  You can check the rpath of the
Darshan library by running `objdump -x
/home/carns/darshan-install/lib/libdarshan.so |grep RPATH`.

== Instrumenting dynamically-linked non-MPI applications

Similar to the process described in the previous section, Darshan relies on the
`LD_PRELOAD` mechanism for instrumenting dynamically-linked non-MPI applications.
This allows Darshan to instrument dynamically-linked binaries produced by non-MPI
compilers (e.g., gcc or clang), extending Darshan instrumentation to new contexts
(like instrumentation of arbitrary Python programs or instrumenting serial
file transfer utilities like `cp` and `scp`).

The only additional step required for non-MPI use is to set the
DARSHAN_ENABLE_NONMPI environment variable, which signals to Darshan that
non-MPI instrumentation is requested:

----
export DARSHAN_ENABLE_NONMPI=1
----

As described in the previous section, it may be desirable to limit the
scope of Darshan's instrumentation by enabling LD_PRELOAD only for the target
executable:

----
env LD_PRELOAD=/home/carns/darshan-install/lib/libdarshan.so io-test
----

[NOTE]
Recall that Darshan instrumentation of non-MPI applications is only possible with 
dynamically-linked applications.
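
Both variables can also be combined in a single invocation; for example, to
instrument a serial `cp` command (the file paths are illustrative):

----
env DARSHAN_ENABLE_NONMPI=1 LD_PRELOAD=/home/carns/darshan-install/lib/libdarshan.so cp infile outfile
----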

== Using the Darshan eXtended Tracing (DXT) module

DXT support is disabled by default in Darshan, requiring the user to either explicitly
enable tracing for all files or to provide a trace trigger configuration file describing
which files should be traced at runtime.

To enable tracing globally for all files, Darshan users simply need to set the
DXT_ENABLE_IO_TRACE environment variable as follows:

----
export DXT_ENABLE_IO_TRACE=1
----

To enable tracing for particular files, DXT additionally offers a trace
triggering mechanism, with users specifying triggers used to decide whether or
not to trace a particular file at runtime. Files that do not match any trace
trigger will not store trace data in the Darshan log. Currently, DXT supports
the following types of trace triggers:

* file triggers: trace files based on regex matching of file paths
* rank triggers: trace files based on regex matching of ranks
* dynamic triggers: trace files based on runtime analysis of I/O characteristics (e.g., frequent small or unaligned I/O accesses)

Users simply need to specify one or more of these triggers in a text file that is passed
to DXT at runtime -- when multiple triggers are specified, DXT will keep any file traces
that match at least one trigger (i.e., the trace decision is a logical OR across the given triggers).
An example configuration file is given below, illustrating the syntax to use for currently
supported triggers:

----
FILE .h5$           # trace all files with a '.h5' extension
FILE ^/tmp          # trace all files with a path prefix of '/tmp'
RANK [1-2]          # trace all files accessed by ranks 1-2
SMALL_IO .5         # trace all files with greater than 50% small (less than 10 KB) accesses
UNALIGNED_IO .5     # trace all files with greater than 50% unaligned accesses
----

FILE and RANK triggers take a single parameter representing the regex that will be compared
to the file name and the rank accessing it, respectively. Regex support is provided by the
POSIX `regex.h` interface -- refer to the https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/regex.h.html[manpage] for more details on regex syntax.
SMALL_IO and UNALIGNED_IO triggers take a single parameter representing the lower
threshold percentage of accesses of the given type.

Set the DXT_TRIGGER_CONF_PATH environment variable to notify DXT of the path of the
configuration file:

----
export DXT_TRIGGER_CONF_PATH=/path/to/dxt/config/file
----
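
Putting these pieces together, a hypothetical run that traces only HDF5 files
might look like the following (the configuration file name is arbitrary):

----
cat > dxt-triggers.conf <<'EOF'
FILE .h5$
EOF
export DXT_TRIGGER_CONF_PATH=$PWD/dxt-triggers.conf
mpiexec -n 4 ./mpi-io-test
----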

== Darshan installation recipes

The following recipes provide examples for prominent HPC systems.
These are intended to be used as a starting point.  You will most likely have to adjust paths and options to
reflect the specifics of your system.

=== IBM Blue Gene (BG/P or BG/Q)

IBM Blue Gene systems produce static executables by default, use a
different architecture for login and compute nodes, and use an MPI
environment based on MPICH.

The following example shows how to configure Darshan on a BG/P system:

----
./configure --with-mem-align=16 \
 --with-log-path=/home/carns/working/darshan/releases/logs \
 --prefix=/home/carns/working/darshan/install --with-jobid-env=COBALT_JOBID \
 --with-zlib=/soft/apps/zlib-1.2.3/ \
 --host=powerpc-bgp-linux CC=/bgsys/drivers/ppcfloor/comm/default/bin/mpicc 
----

.Rationale
[NOTE]
====
The memory alignment is set to 16 not because that is the proper alignment
for the BG/P CPU architecture, but because that is the optimal alignment for
the network transport used between compute nodes and I/O nodes in the
system.  The jobid environment variable is set to `COBALT_JOBID` in this
case for use with the Cobalt scheduler, but other BG/P systems may use
different schedulers.  The `--with-zlib` argument is used to point to a
version of zlib that has been compiled for use on the compute nodes rather
than the login node.  The `--host` argument is used to force cross-compilation
of Darshan.  The `CC` variable is set to point to a stock MPI compiler.
====

Once Darshan has been installed, you can use one of the static
instrumentation methods described earlier in this document.  If you
use the profiling configuration file method, then please note that the
Darshan installation includes profiling configuration files that have been
adapted specifically for the Blue Gene environment.  Set the following
environment variables to enable them, and then use your normal compiler
scripts.  This method is compatible with both GNU and IBM compilers.

Blue Gene profiling configuration example:
----
export MPICC_PROFILE=$DARSHAN_PREFIX/share/mpi-profile/darshan-bg-cc
export MPICXX_PROFILE=$DARSHAN_PREFIX/share/mpi-profile/darshan-bg-cxx
export MPIF77_PROFILE=$DARSHAN_PREFIX/share/mpi-profile/darshan-bg-f
export MPIF90_PROFILE=$DARSHAN_PREFIX/share/mpi-profile/darshan-bg-f
----

=== Cray platforms (XE, XC, or similar)

The Cray programming environment produces static executables by default,
which means that Darshan instrumentation must be inserted at compile
time.  This can be accomplished by loading a software module that sets
appropriate environment variables to modify the Cray compiler script link
behavior.  This section describes how to compile and install Darshan,
as well as how to use a software module to enable and disable Darshan
instrumentation.

==== Building and installing Darshan

Please set your environment to use the GNU programming environment before
configuring or compiling Darshan.  Although Darshan can be built with a
variety of compilers, the GNU compilers are recommended because they
produce a Darshan library that is interoperable with the widest range
of compilers and linkers.  On most Cray systems you can enable the GNU
programming environment with a command similar to "module swap PrgEnv-pgi
PrgEnv-gnu".  Please see your site documentation for information about
how to switch programming environments.

The following example shows how to configure and build Darshan on a Cray
system using the GNU programming environment.  Adjust the 
--with-log-path and --prefix arguments to point to the desired log file path 
and installation path, respectively.

----
module swap PrgEnv-pgi PrgEnv-gnu
./configure \
 --with-log-path=/shared-file-system/darshan-logs \
 --prefix=/soft/darshan-2.2.3 \
 --with-jobid-env=PBS_JOBID --disable-cuserid CC=cc
make install
module swap PrgEnv-gnu PrgEnv-pgi
----

.Rationale
[NOTE]
====
The job ID is set to `PBS_JOBID` for use with a Torque or PBS based scheduler.
The `CC` variable is configured to point to the standard MPI compiler.

The --disable-cuserid argument is used to prevent Darshan from attempting to
use the cuserid() function to retrieve the user name associated with a job.
Darshan automatically falls back to other methods if this function fails,
but on some Cray environments (notably the Beagle XE6 system as of March 2012)
the cuserid() call triggers a segmentation fault.  With this option set,
Darshan will typically use the LOGNAME environment variable to determine a
userid.
====

As in any Darshan installation, the darshan-mk-log-dirs.pl script can then be 
used to create the appropriate directory hierarchy for storing Darshan log 
files in the --with-log-path directory.

Note that Darshan is not currently capable of detecting the stripe size
(and therefore the Darshan FILE_ALIGNMENT value) on Lustre file systems.
If a Lustre file system is detected, then Darshan assumes an optimal
file alignment of 1 MiB.

==== Enabling Darshan instrumentation 

Darshan will automatically install example software module files in the
following locations (depending on how you specified the --prefix option in
the previous section):

----
/soft/darshan-2.2.3/share/craype-1.x/modulefiles/darshan
/soft/darshan-2.2.3/share/craype-2.x/modulefiles/darshan
----

Select the one that is appropriate for your Cray programming environment
(see the version number of the craype module in `module list`).

If you are using the Cray Programming Environment version 1.x, then you
must modify the corresponding modulefile before using it.  Please see
the comments at the end of the file and choose an environment variable
method that is appropriate for your system.  If this is not done, then
the compiler may fail to link some applications when the Darshan module
is loaded.

If you are using the Cray Programming Environment version 2.x then you can
likely use the modulefile as is.  Note that it pulls most of its
configuration from the lib/pkgconfig/darshan-runtime.pc file installed with
Darshan.

The modulefile that you select can be copied to a system location, or the
install location can be added to your local module path with the following
command:

----
module use /soft/darshan-2.2.3/share/craype-<VERSION>/modulefiles
----

From this point, Darshan instrumentation can be enabled for all future
application compilations by running "module load darshan".

=== Linux clusters using Intel MPI 

Most Intel MPI installations produce dynamic executables by default.  To
configure Darshan in this environment you can use the following example:

----
./configure --with-log-path=/darshan-logs --with-jobid-env=PBS_JOBID CC=mpicc
----

.Rationale
[NOTE]
====
There is nothing unusual in this configuration except that you should use
the underlying GNU compilers rather than the Intel ICC compilers to compile
Darshan itself.
====

You can use the `LD_PRELOAD` method described earlier in this document to
instrument executables compiled with the Intel MPI compiler scripts.  This
method has been briefly tested using both GNU and Intel compilers.
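
For example, Intel MPI's `mpirun` typically accepts a `-genv` option for this
purpose (option syntax may vary across versions; paths are illustrative):

----
mpirun -n 4 -genv LD_PRELOAD /home/carns/darshan-install/lib/libdarshan.so ./mpi-io-test
----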

558
=== Linux clusters using MPICH 
559

560 561 562 563 564 565 566
Follow the generic instructions provided at the top of this document.  For MPICH versions 3.1 and
later, MPICH uses shared libraries by default, so you may need to consider the dynamic linking
instrumentation approach.  

The static linking method can be used if MPICH is configured to use static
linking by default, or if you are using a version prior to 3.1.
The only modification is to make sure that the `CC` used for compilation is
567 568 569
based on a GNU compiler.  Once Darshan has been installed, it should be
capable of instrumenting executables built with GNU, Intel, and PGI
compilers.
Philip Carns's avatar
Philip Carns committed
570

571 572 573 574 575 576 577
[NOTE]
MPICH versions 3.1, 3.1.1, 3.1.2, and 3.1.3 may produce link-time errors when building static
executables (i.e. using the -static option) if MPICH is built with shared library support.
Please see http://trac.mpich.org/projects/mpich/ticket/2190 for more details.  The workaround if you
wish to use static linking is to configure MPICH with `--enable-shared=no --enable-static=yes` to
force it to use static MPI libraries with correct dependencies.

578 579 580 581 582 583 584 585 586 587 588 589 590
=== Linux clusters using Open MPI

Follow the generic instructions provided at the top of this document for
compilation, and make sure that the `CC` used for compilation is based on a
GNU compiler.

Open MPI typically produces dynamically linked executables by default, which
means that you should use the `LD_PRELOAD` method to instrument executables
that have been built with Open MPI.  Darshan is only compatible with Open
MPI 1.6.4 and newer.  For more details on why Darshan is not compatible with
older versions of Open MPI, please refer to the following mailing list discussion:

http://www.open-mpi.org/community/lists/devel/2013/01/11907.php
591

592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608
== Upgrading to Darshan 3.x from 2.x

Beginning with Darshan 3.0.0, Darshan has been rewritten to modularize its runtime environment
and log file format to simplify the addition of new I/O characterization data. The process of
compiling and installing the Darshan 3.x source code should essentially be identical to this
process on Darshan 2.x. Therefore, the installation recipes given in the previous section
should work irrespective of the Darshan version being used. Similarly, the manner in which
Darshan is used should be the same across versions -- the sections in this document regarding
Darshan link:darshan-runtime.html#_environment_preparation[environment preparation],
instrumenting link:darshan-runtime.html#_instrumenting_statically_linked_applications[statically
linked applications] and link:darshan-runtime.html#_instrumenting_dynamically_linked_applications[
dynamically linked applications], and using link:darshan-runtime.html#_runtime_environment_variables[
runtime environment variables] are equally applicable to both versions.

However, we do provide some suggestions and expectations for system administrators to keep in
mind when upgrading to Darshan 3.x:

* Darshan 2.x and Darshan 3.x produce incompatible log file formats
    ** log files named *.darshan.gz or *.darshan.bz2: Darshan 2.x format
    ** log files named *.darshan: Darshan 3.x format
        *** a field in the log file header indicates underlying compression method in 3.x logs
    ** There is currently no tool for converting 2.x logs into the 3.x log format.
    ** The `darshan-logutils` library will provide error messages to indicate whether a given
log file is incompatible with the corresponding library version.

* We encourage administrators to use the same log file directory for version 3.x as had been
used for version 2.x.
    ** Within this directory, which set of log utilities (version 2.x or version 3.x) to use
for a given log can be determined from its file extension, as explained above and sketched below.
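
For instance, a minimal shell sketch of that dispatch logic:

----
# pick the log utilities matching a given log file's format
case "$logfile" in
    *.darshan.gz|*.darshan.bz2) echo "use Darshan 2.x utilities" ;;
    *.darshan)                  echo "use Darshan 3.x utilities" ;;
esac
----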

== Runtime environment variables

The Darshan library honors the following environment variables to modify
behavior at runtime:

* DARSHAN_DISABLE: disables Darshan instrumentation
* DARSHAN_INTERNAL_TIMING: enables internal instrumentation that will print the time required to startup and shutdown Darshan to stderr at run time.
* DARSHAN_LOGHINTS: specifies the MPI-IO hints to use when storing the Darshan output file.  The format is a semicolon-delimited list of key=value pairs, for example: hint1=value1;hint2=value2
* DARSHAN_MEMALIGN: specifies a value for system memory alignment
* DARSHAN_JOBID: specifies the name of the environment variable to use for the job identifier, such as PBS_JOBID
* DARSHAN_DISABLE_SHARED_REDUCTION: disables the step in Darshan aggregation in which files that were accessed by all ranks are collapsed into a single cumulative file record at rank 0.  This option retains more per-process information at the expense of creating larger log files. Note that it is up to individual instrumentation module implementations whether this environment variable is actually honored.
* DARSHAN_LOGPATH: specifies the path to write Darshan log files to. Note that this directory needs to be formatted using the darshan-mk-log-dirs.pl script.
* DARSHAN_LOGFILE: specifies the path (directory + Darshan log file name) to write the output Darshan log to. This overrides the default Darshan behavior of automatically generating a log file name and adding it to a log file directory formatted using the darshan-mk-log-dirs.pl script.
* DARSHAN_MODMEM: specifies the maximum amount of memory (in MiB) Darshan instrumentation modules can collectively consume at runtime (if not specified, Darshan uses a default quota of 2 MiB).
* DARSHAN_MMAP_LOGPATH: if Darshan's mmap log file mechanism is enabled, this variable specifies what path the mmap log files should be stored in (if not specified, log files will be stored in `/tmp`).
* DARSHAN_EXCLUDE_DIRS: specifies a list of comma-separated paths that Darshan will not instrument at runtime (in addition to Darshan's default blacklist)
* DXT_ENABLE_IO_TRACE: setting this environment variable enables the DXT (Darshan eXtended Tracing) modules at runtime for all files instrumented by Darshan. Currently, DXT is hard-coded to use a maximum of 4 MiB of trace memory per process (in addition to memory used by other modules).
* DXT_DISABLE_IO_TRACE: setting this environment variable disables the DXT module at runtime for all files instrumented by Darshan.
* DXT_TRIGGER_CONF_PATH: File path to a DXT trace trigger configuration file, which specifies triggers used by DXT to decide which files to trace at runtime. Note that the trace triggering mechanism is overridden by the DXT_ENABLE_IO_TRACE and DXT_DISABLE_IO_TRACE environment variables.
* DARSHAN_ENABLE_NONMPI: setting this environment variable is required to generate Darshan logs for non-MPI applications
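
For example, a hypothetical job script fragment that raises the module memory
quota and writes the log to an explicit path (paths are illustrative):

----
export DARSHAN_MODMEM=4
export DARSHAN_LOGFILE=/scratch/$USER/mpi-io-test.darshan
mpiexec -n 4 env LD_PRELOAD=/home/carns/darshan-install/lib/libdarshan.so ./mpi-io-test
----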

== Debugging

=== No log file

In cases where Darshan is not generating a log file for an application, some common things to check are:

* Check stderr to ensure Darshan isn't indicating any internal errors (e.g., invalid log file path)

For statically linked executables:

* Ensure that Darshan symbols are present in the underlying executable by running `nm` on it:
----
> nm test | grep darshan
0000000000772260 b darshan_core
0000000000404440 t darshan_core_cleanup
00000000004049b0 T darshan_core_initialize
000000000076b660 d darshan_core_mutex
00000000004070a0 T darshan_core_register_module
----

* Make sure the application executable is statically linked:
    ** In general, we encourage the use of purely statically linked executables when using the static
instrumentation method given in link:darshan-runtime.html#_instrumenting_statically_linked_applications[Section 5]
    ** If purely static executables are not an option, we encourage users to use the LD_PRELOAD method of
instrumentation given in link:darshan-runtime.html#_instrumenting_dynamically_linked_applications[Section 6]
    ** Statically linked executables are the default on Cray platforms and the IBM BG platforms; 
statically linked executables can be explicitly requested using the `-static` compile option to most compilers
    ** You can verify that an executable is purely statically linked by using the `file` command:
----
> file mpi-io-test
mpi-io-test: ELF 64-bit LSB  executable, x86-64, version 1 (GNU/Linux), statically linked, for GNU/Linux 2.6.24, BuildID[sha1]=9893e599e7a560159ccf547b4c4ba5671f65ba32, not stripped
----

* Ensure that the linker is correctly linking in Darshan's runtime libraries:
    ** A common mistake is to explicitly link in the underlying MPI libraries (e.g., `-lmpich` or `-lmpichf90`)
in the link command, which can interfere with Darshan's instrumentation
        *** These libraries are usually linked in automatically by the compiler
        *** The `-show` flag of MPICH's `mpicc` compiler can be used to examine the invoked link command, for instance
    ** The linker's `-y` option can be used to verify that Darshan is properly intercepting the MPI_Init
function (e.g. by setting `CFLAGS='-Wl,-yMPI_Init'`), which it uses to initialize its runtime structures:
----
/usr/common/software/darshan/3.0.0-pre3/lib/libdarshan.a(darshan-core-init-finalize.o): definition of MPI_Init
----
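
As noted above, MPICH's `-show` flag prints the underlying compile and link
command without executing it; the output below is purely illustrative and will
differ by installation:

----
> mpicc -show
gcc -I/usr/local/mpich/include -L/usr/local/mpich/lib -lmpi
----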
----