Darshan-runtime installation and usage
======================================

== Introduction

This document describes darshan-runtime, which is the instrumentation
portion of the Darshan characterization tool.  It should be installed on the
system where you intend to collect I/O characterization information.

Darshan instruments applications via either compile time wrappers or
dynamic library preloading.  An application that has been instrumented
with Darshan will produce a single log file each time it is executed.
This log summarizes the I/O access patterns used by the application.

The darshan-runtime instrumentation has traditionally only supported MPI
applications (specifically, those that call `MPI_Init()` and `MPI_Finalize()`),
but, as of version 3.2.0, Darshan also supports instrumentation of non-MPI
applications. Regardless of whether MPI is used, Darshan provides detailed
statistics about POSIX level file accesses made by the application.
In the case of MPI applications, Darshan additionally captures details on MPI-IO
level access, as well as limited information about HDF5 and PnetCDF access.
Note that instrumentation of non-MPI applications is currently only supported
in Darshan's shared library, which applications must `LD_PRELOAD`.

Starting in version 3.0.0, Darshan also exposes an API that can be used to develop
and add new instrumentation modules (for other I/O library interfaces or to gather
system-specific data, for instance), as detailed in
http://www.mcs.anl.gov/research/projects/darshan/docs/darshan-modularization.html[this document].
Newly contributed modules include a module for gathering system-specific parameters
for jobs running on BG/Q systems, a module for gathering Lustre striping data for
files on Lustre file systems, and a module for instrumenting stdio (i.e., stream I/O
functions like `fopen()`, `fread()`, etc.).

Starting in version 3.1.3, Darshan also allows for full tracing of application I/O
workloads using the newly developed Darshan eXtended Tracing (DXT) instrumentation
module. This module can be selectively enabled at runtime to provide high-fidelity
traces of an application's I/O workload, as opposed to the coarse-grained I/O summary
data that Darshan has traditionally provided. Currently, DXT only traces at the POSIX
and MPI-IO layers. Initial link:DXT-overhead.pdf[performance results] indicate that
DXT tracing has low overhead, with performance comparable to Darshan's traditional
coarse-grained instrumentation methods.

This document provides generic installation instructions, but "recipes" for
several common HPC systems are provided at the end of the document as well.

More information about Darshan can be found at the
http://www.mcs.anl.gov/darshan[Darshan web site].

== Requirements

* C compiler (preferably GCC-compatible)
* zlib development headers and library

== Compilation

.Configure and build example (with MPI support)
----
tar -xvzf darshan-<version-number>.tar.gz
cd darshan-<version-number>/darshan-runtime
./configure --with-log-path=/darshan-logs --with-jobid-env=PBS_JOBID CC=mpicc
make
make install
----

.Configure and build example (without MPI support)
----
tar -xvzf darshan-<version-number>.tar.gz
cd darshan-<version-number>/darshan-runtime
./configure --with-log-path=/darshan-logs --with-jobid-env=PBS_JOBID --without-mpi CC=gcc
make
make install
----

.Explanation of configure arguments:
* `--with-mem-align=`: This value is system-dependent and will be
used by Darshan to determine if the buffer for a read or write operation is
aligned in memory (default is 8).
* `--with-jobid-env=` (mandatory): this specifies the environment variable that
Darshan should check to determine the jobid of a job.  Common values are
`PBS_JOBID` or `COBALT_JOBID`.  If you are not using a scheduler (or your
scheduler does not advertise the job ID) then you can specify `NONE` here.
Darshan will fall back to using the pid of the rank 0 process if the
specified environment variable is not set.
* `--with-log-path=` (this, or `--with-log-path-by-env`, is mandatory): This
specifies the parent directory for the directory tree where Darshan logs
will be placed.
** NOTE: after installation, any user can display the configured path with the `darshan-config --log-path` command
* `--with-log-path-by-env=`: specifies an environment variable to use to
determine the log path at run time.
* `--with-log-hints=`: specifies hints to use when writing the Darshan log
file.  See `./configure --help` for details.
* `--with-mod-mem=`: specifies the maximum amount of memory (in MiB) that
active Darshan instrumentation modules can collectively consume.
* `--with-zlib=`: specifies an alternate location for the zlib development
header and library.
* `CC=`: specifies the C compiler to use for compilation.
* `--without-mpi`: disables MPI support when building Darshan - MPI support is
assumed if not specified.
* `--enable-mmap-logs`: enables the use of Darshan's mmap log file mechanism.
* `--disable-cuserid`: disables use of cuserid() at runtime.
* `--disable-ld-preload`: disables building of the Darshan LD_PRELOAD library
* `--disable-bgq-mod`: disables building of the BG/Q module (default checks
and only builds if BG/Q environment detected).
* `--enable-group-readable-logs`: sets Darshan log file permissions to allow
group read access.
* `--enable-HDF5-pre-1.10`: enables the Darshan HDF5 instrumentation module,
with support for HDF5 versions prior to 1.10
* `--enable-HDF5-post-1.10`: enables the Darshan HDF5 instrumentation module,
with support for HDF5 versions 1.10 or higher
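
As a rough illustration of how these arguments combine, the sketch below assembles a configure command line for a hypothetical site that uses Slurm and wants group-readable logs. The log path, the `SLURM_JOB_ID` variable, and the 4 MiB module memory value are assumptions chosen for the example, not defaults or recommendations. The command is only printed, not executed.

```shell
# Hypothetical site settings; adjust every value for your system.
LOG_PATH=/darshan-logs          # parent directory for the log tree
JOBID_ENV=SLURM_JOB_ID          # job ID variable set by the scheduler (assumed Slurm)
MOD_MEM_MIB=4                   # memory quota for instrumentation modules, in MiB

# Build the command line from options documented in the list above.
CONFIGURE_CMD="./configure --with-log-path=$LOG_PATH --with-jobid-env=$JOBID_ENV --with-mod-mem=$MOD_MEM_MIB --enable-group-readable-logs CC=mpicc"

# Print the command for review instead of running it.
echo "$CONFIGURE_CMD"
```

Printing the command first makes it easy to review the chosen options before running them from the `darshan-runtime` source directory.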

== Environment preparation

Once darshan-runtime has been installed, you must prepare a location
in which to store the Darshan log files and configure an instrumentation method.

This step can be safely skipped if you configured darshan-runtime using the
`--with-log-path-by-env` option.  A more typical configuration uses a static
directory hierarchy for Darshan log files.

The `darshan-mk-log-dirs.pl` utility will configure the path specified at
configure time to include subdirectories organized by year, month, and day in
which log files will be placed. The deepest subdirectories will have sticky
permissions to enable multiple users to write to the same directory.  If the
log directory is shared system-wide across many users then the following
script should be run as root.

----
darshan-mk-log-dirs.pl
----

.A note about finding log paths after installation
[NOTE]
====
Regardless of whether a Darshan installation is using the --with-log-path or
--with-log-path-by-env option, end users can display the path (and/or
environment variables) at any time by running `darshan-config --log-path`
on the command line.
====

.A note about log directory permissions
[NOTE]
====
All log files written by Darshan have permissions set to only allow
read access by the owner of the file.  You can modify this behavior,
however, by specifying the --enable-group-readable-logs option at
configure time.  One notable deployment scenario would be to configure
Darshan and the log directories to allow all logs to be readable by both the
end user and a Darshan administrators group.   This can be done with the
following steps:

* set the --enable-group-readable-logs option at configure time
* create the log directories with darshan-mk-log-dirs.pl
* recursively set the group ownership of the log directories to the Darshan
administrators group
* recursively set the setgid bit on the log directories
====
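
The last two steps above might look like the following in a shell. This is a sketch under assumptions: it builds a throwaway directory tree with `mkdir` in place of running `darshan-mk-log-dirs.pl`, and it uses the caller's primary group as a stand-in for a Darshan administrators group. On a real system you would point `LOG_ROOT` at the configured log path, set `ADMIN_GROUP` to your own group, and run the commands as a user permitted to change ownership.

```shell
# Stand-in for the configured log path; a temporary directory is used here
# so the sketch can be tried without touching a real installation.
LOG_ROOT=$(mktemp -d)
# Stand-in for the Darshan administrators group (numeric ID of the caller's
# primary group; replace with your administrators group).
ADMIN_GROUP=$(id -g)

# Placeholder for the tree that darshan-mk-log-dirs.pl would create
# (the date components are illustrative).
mkdir -p "$LOG_ROOT/2024/1/15"

# Recursively hand group ownership to the administrators group.
chgrp -R "$ADMIN_GROUP" "$LOG_ROOT"

# Set the setgid bit on every directory so new logs inherit that group.
find "$LOG_ROOT" -type d -exec chmod g+s {} +

ls -ld "$LOG_ROOT/2024/1/15"
```

Using `find -type d` limits the setgid bit to directories, which is where it affects group inheritance for newly created log files.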

== Instrumenting applications

[NOTE]
====
More specific installation "recipes" are provided later in this document for
some platforms.  This section of the documentation covers general techniques.
====

Once Darshan has been installed and a log path has been prepared, the next
step is to actually instrument applications. The preferred method is to
instrument applications at compile time.

=== Option 1: Instrumenting MPI applications at compile time

This method is applicable to C, Fortran, and C++ MPI applications
(regardless of whether they are statically or dynamically linked) and is the most
straightforward method to apply transparently system-wide.  It works by
injecting additional libraries and options into the linker command line to
intercept relevant I/O calls.

On Cray platforms you can enable the compile time instrumentation by simply
loading the Darshan module.  It can then be enabled for all users by placing
that module in the default environment. As of Darshan 3.2.0 this will
instrument both static and dynamic executables, while in previous versions
of Darshan this was only sufficient for static executables.  See the Cray
installation recipe for more details.

For other general MPICH-based MPI implementations, you can generate
Darshan-enabled variants of the standard mpicc/mpicxx/mpif90/mpif77
wrappers using the following commands:

----
darshan-gen-cc.pl `which mpicc` --output mpicc.darshan
darshan-gen-cxx.pl `which mpicxx` --output mpicxx.darshan
darshan-gen-fortran.pl `which mpif77` --output mpif77.darshan
darshan-gen-fortran.pl `which mpif90` --output mpif90.darshan
----

The resulting *.darshan wrappers will transparently inject Darshan
instrumentation into the link step without any explicit user intervention.
They can be renamed and placed in an appropriate
PATH to enable automatic instrumentation.  This method also works correctly
for both static and dynamic executables as of Darshan 3.2.0.

For other systems you can enable compile-time instrumentation by either
manually adding the appropriate link options to your command line or
modifying your default MPI compiler script.  The `darshan-config` command
line tool can be used to display the options that you should use:

----
# Linker options to use for dynamic linking (default on most platforms)
#   These arguments should go *before* the MPI libraries in the underlying
#   linker command line to ensure that Darshan can be activated.  It should
#   also ideally go before other libraries that may issue I/O function calls.
darshan-config --dyn-ld-flags

# Linker options to use for static linking
#   The first set of arguments should go early in the link command line
#   (before MPI), while the second set should go at the end of the link
#   command line.
darshan-config --pre-ld-flags
darshan-config --post-ld-flags
----
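
As a sketch of the manual approach for dynamic linking, the snippet below captures the output of `darshan-config --dyn-ld-flags` and places it on an `mpicc` link line. The object file `my-app.o` and the output name are hypothetical, and the placement assumes a compiler wrapper that appends the MPI libraries after the user's arguments (as MPICH's wrappers do). The command is printed rather than executed.

```shell
# Query Darshan for its dynamic-link flags when the tool is available.
if command -v darshan-config >/dev/null 2>&1; then
    DARSHAN_LDFLAGS=$(darshan-config --dyn-ld-flags)
else
    # Fall back to an empty value so the sketch still runs without Darshan.
    DARSHAN_LDFLAGS=""
    echo "darshan-config not found on PATH; showing an uninstrumented link line" >&2
fi

# my-app.o is a hypothetical object file. The Darshan flags follow the
# objects directly so that they precede any other libraries on the line
# and the MPI libraries that the wrapper is assumed to append.
LINK_CMD="mpicc -o my-app my-app.o $DARSHAN_LDFLAGS"
echo "$LINK_CMD"
```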

==== Using a profile configuration

The MPICH MPI implementation supports the specification of a profiling library
configuration that can be used to insert Darshan instrumentation without
modifying the existing MPI compiler script. You can enable a profiling
configuration using environment variables or command line arguments to the
compiler scripts:

Example for MPICH 3.1.1 or newer:
----
export MPICC_PROFILE=$DARSHAN_PREFIX/share/mpi-profile/darshan-cc
export MPICXX_PROFILE=$DARSHAN_PREFIX/share/mpi-profile/darshan-cxx
export MPIFORT_PROFILE=$DARSHAN_PREFIX/share/mpi-profile/darshan-f
----

Examples for command line use:
----
mpicc -profile=$DARSHAN_PREFIX/share/mpi-profile/darshan-cc <args>
mpicxx -profile=$DARSHAN_PREFIX/share/mpi-profile/darshan-cxx <args>
mpif77 -profile=$DARSHAN_PREFIX/share/mpi-profile/darshan-f <args>
mpif90 -profile=$DARSHAN_PREFIX/share/mpi-profile/darshan-f <args>
----

Note that unlike the previously described methods in this section, this
method *will not* automatically adapt to static and dynamic linking options.
The example profile configurations shown above only support dynamic linking.

Example profile configurations are also provided with a "-static" suffix if
you need examples for static linking.

=== Option 2: Instrumenting MPI applications at run time

This method is applicable to pre-compiled dynamically linked executables
as well as interpreted languages such as Python.  You do not need to
change your compile options in any way.  This method works by injecting
instrumentation at run time.  It will not work for statically linked
executables.

To use this mechanism, set the `LD_PRELOAD` environment variable to the full
path to the Darshan shared library. The preferred method of inserting Darshan
instrumentation in this case is to set the `LD_PRELOAD` variable specifically
for the application of interest. Typically this is possible using
command line arguments offered by the `mpirun` or `mpiexec` scripts or by
the job scheduler:

----
mpiexec -n 4 -env LD_PRELOAD /home/carns/darshan-install/lib/libdarshan.so mpi-io-test
----

----
srun -n 4 --export=LD_PRELOAD=/home/carns/darshan-install/lib/libdarshan.so mpi-io-test
----

For sequential invocations of MPI programs, the following will set `LD_PRELOAD` for the duration of that process only:

----
env LD_PRELOAD=/home/carns/darshan-install/lib/libdarshan.so mpi-io-test
----

Other environments may have other specific options for controlling this behavior.
Please check your local site documentation for details.

It is also possible to simply export `LD_PRELOAD` as follows, but this is not
recommended because it can pull Darshan and MPI symbols into unrelated
binaries:

----
export LD_PRELOAD=/home/carns/darshan-install/lib/libdarshan.so
----
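
One way to keep the preload scoped to a single command is a small shell helper such as the sketch below. The function name `run_with_darshan` is hypothetical (it is not part of Darshan); it only wraps the `env LD_PRELOAD=...` form shown earlier and adds a check that the library file exists.

```shell
# Run one command with Darshan preloaded, leaving the calling shell untouched.
# Usage: run_with_darshan /path/to/libdarshan.so command [args...]
run_with_darshan() {
    darshan_lib=$1
    shift
    if [ ! -f "$darshan_lib" ]; then
        echo "Darshan library not found: $darshan_lib" >&2
        return 1
    fi
    # env applies LD_PRELOAD to the child process only.
    env LD_PRELOAD="$darshan_lib" "$@"
}

# Example call (the path matches the one used in this section):
# run_with_darshan /home/carns/darshan-install/lib/libdarshan.so ./mpi-io-test
```

Because the variable is never exported, later commands in the same shell run without Darshan.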

[NOTE]
For SGI systems running the MPT environment, it may be necessary to set the `MPI_SHEPHERD`
environment variable equal to `true` to avoid deadlock when preloading the Darshan shared
library.

=== Option 3: Instrumenting non-MPI applications at run time

Similar to the process described in the previous section, Darshan relies on the
`LD_PRELOAD` mechanism for instrumenting dynamically-linked non-MPI applications.
This allows Darshan to instrument dynamically-linked binaries produced by non-MPI
compilers (e.g., gcc or clang), extending Darshan instrumentation to new contexts
(like instrumentation of arbitrary Python programs or instrumenting serial
file transfer utilities like `cp` and `scp`).

The only additional step required of Darshan non-MPI users is to also set the
DARSHAN_ENABLE_NONMPI environment variable to signal to Darshan that non-MPI
instrumentation is requested:

----
export DARSHAN_ENABLE_NONMPI=1
----

As described in the previous section, users may want to limit the scope of
Darshan's instrumentation by setting `LD_PRELOAD` only for the target
executable:

----
env LD_PRELOAD=/home/carns/darshan-install/lib/libdarshan.so io-test
----

[NOTE]
Recall that Darshan instrumentation of non-MPI applications is only possible with 
dynamically-linked applications.

=== Using other profiling tools at the same time as Darshan

As of Darshan version 3.2.0, Darshan does not necessarily interfere with
other profiling tools (particularly those using the PMPI profiling
interface).  Darshan itself does not use the PMPI interface, and instead
uses dynamic linker symbol interception or --wrap function interception for
static executables.

As a rule of thumb most profiling tools should appear in the linker command
line *before* -ldarshan if possible.

== Using the Darshan eXtended Tracing (DXT) module

DXT support is disabled by default in Darshan, requiring the user to either explicitly
enable tracing for all files or to provide a trace trigger configuration file describing
which files should be traced at runtime.

To enable tracing globally for all files, Darshan users simply need to set the
DXT_ENABLE_IO_TRACE environment variable as follows:

----
export DXT_ENABLE_IO_TRACE=1
----

To enable tracing for particular files, DXT additionally offers a trace
triggering mechanism, with users specifying triggers used to decide whether or
not to trace a particular file at runtime. Files that do not match any trace
trigger will not store trace data in the Darshan log. Currently, DXT supports
the following types of trace triggers:

* file triggers: trace files based on regex matching of file paths
* rank triggers: trace files based on regex matching of ranks
* dynamic triggers: trace files based on runtime analysis of I/O characteristics (e.g., frequent small or unaligned I/O accesses)

Users simply need to specify one or more of these triggers in a text file that is passed
to DXT at runtime -- when multiple triggers are specified, DXT will keep any file traces
that match at least one trigger (i.e., the trace decision is a logical OR across given triggers).
An example configuration file is given below, illustrating the syntax to use for currently
supported triggers:

----
FILE .h5$           # trace all files with a '.h5' extension
FILE ^/tmp          # trace all files with a path prefix of '/tmp'
RANK [1-2]          # trace all files accessed by ranks 1-2
SMALL_IO .5         # trace all files with greater than 50% small (less than 10 KB) accesses
UNALIGNED_IO .5     # trace all files with greater than 50% unaligned accesses
----

FILE and RANK triggers take a single parameter representing the regex that will be compared
to the file name and the rank accessing it, respectively. Regex support is provided by the
POSIX `regex.h` interface -- refer to the https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/regex.h.html[manpage] for more details on regex syntax.
SMALL_IO and UNALIGNED_IO triggers take a single parameter representing the lower
threshold percentage of accesses of the given type.
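
Because the patterns are regular expressions rather than shell globs, an unescaped `.` matches any character. The sketch below uses `grep` purely as a stand-in to illustrate that behavior on a few hypothetical paths. It does not invoke DXT, and the helper name `path_matches` is made up for the example.

```shell
# Return success when the path in $2 matches the regex in $1.
path_matches() {
    printf '%s\n' "$2" | grep -q -e "$1"
}

# '.h5$' matches the intended extension...
path_matches '.h5$' '/data/run1/output.h5' && echo "output.h5: match"
# ...but also any character followed by 'h5', since '.' is a wildcard.
path_matches '.h5$' '/data/run1/outputXh5' && echo "outputXh5: match"
# Escaping the dot restricts the pattern to a literal '.h5' suffix.
path_matches '\.h5$' '/data/run1/outputXh5' || echo "outputXh5: no match with escaped dot"
# '^/tmp' anchors at the start of the path, so it also matches '/tmpfiles'.
path_matches '^/tmp' '/tmpfiles/a.dat' && echo "/tmpfiles/a.dat: match"
```

If a trigger should apply only to a specific extension or directory, escaping the dot or adding a trailing `/` to the prefix narrows the match accordingly.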

Set the DXT_TRIGGER_CONF_PATH environment variable to notify DXT of the path of the
configuration file:

----
export DXT_TRIGGER_CONF_PATH=/path/to/dxt/config/file
----
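
Putting the two steps together, a trigger file can be written and registered from the shell as sketched below. The file location comes from `mktemp` only so that the example is self-contained; the trigger values are illustrative, not recommendations.

```shell
# Write a small trigger file (location chosen by mktemp for this sketch).
DXT_TRIGGER_CONF_PATH=$(mktemp)
cat > "$DXT_TRIGGER_CONF_PATH" <<'EOF'
FILE \.h5$
RANK [1-2]
SMALL_IO .5
EOF

# Make the path visible to processes launched from this shell.
export DXT_TRIGGER_CONF_PATH
cat "$DXT_TRIGGER_CONF_PATH"
```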

== Darshan installation recipes

The following recipes provide examples for prominent HPC systems.
These are intended to be used as a starting point.  You will most likely have to adjust paths and options to
reflect the specifics of your system.

=== Cray platforms (XE, XC, or similar)

This section describes how to compile and install Darshan,
as well as how to use a software module to enable and disable Darshan
instrumentation on Cray systems.

==== Building and installing Darshan

Please set your environment to use the GNU programming environment before
configuring or compiling Darshan.  Although Darshan can be built with a
variety of compilers, the GNU compiler is recommended because it will
produce a Darshan library that is interoperable with the widest range
of compilers and linkers.  On most Cray systems you can enable the GNU
programming environment with a command similar to "module swap PrgEnv-intel
PrgEnv-gnu".  Please see your site documentation for information about
how to switch programming environments.

The following example shows how to configure and build Darshan on a Cray
system using the GNU programming environment.  Adjust the
--with-log-path and --prefix arguments to point to the desired log file path
and installation path, respectively.

----
module swap PrgEnv-pgi PrgEnv-gnu
./configure \
 --with-log-path=/shared-file-system/darshan-logs \
 --prefix=/soft/darshan-2.2.3 \
 --with-jobid-env=PBS_JOBID --disable-cuserid CC=cc
make install
module swap PrgEnv-gnu PrgEnv-pgi
----

.Rationale
[NOTE]
====
The job ID is set to `PBS_JOBID` for use with a Torque or PBS based scheduler.
The `CC` variable is configured to point to the standard MPI compiler.

The --disable-cuserid argument is used to prevent Darshan from attempting to
use the cuserid() function to retrieve the user name associated with a job.
Darshan automatically falls back to other methods if this function fails,
but on some Cray environments (notably the Beagle XE6 system as of March 2012)
the cuserid() call triggers a segmentation fault.  With this option set,
Darshan will typically use the LOGNAME environment variable to determine a
userid.
====

As in any Darshan installation, the darshan-mk-log-dirs.pl script can then be
used to create the appropriate directory hierarchy for storing Darshan log
files in the --with-log-path directory.

Note that Darshan is not currently capable of detecting the stripe size
(and therefore the Darshan FILE_ALIGNMENT value) on Lustre file systems.
If a Lustre file system is detected, then Darshan assumes an optimal
file alignment of 1 MiB.

==== Enabling Darshan instrumentation

Darshan will automatically install example software module files in the
following locations (depending on how you specified the --prefix option in
the previous section):

----
/soft/darshan-2.2.3/share/craype-1.x/modulefiles/darshan
/soft/darshan-2.2.3/share/craype-2.x/modulefiles/darshan
----

Select the one that is appropriate for your Cray programming environment
(see the version number of the craype module in `module list`).

If you are using the Cray Programming Environment version 1.x, then you
must modify the corresponding modulefile before using it.  Please see
the comments at the end of the file and choose an environment variable
method that is appropriate for your system.  If this is not done, then
the compiler may fail to link some applications when the Darshan module
is loaded.

If you are using the Cray Programming Environment version 2.x then you can
likely use the modulefile as is.  Note that it pulls most of its
configuration from the lib/pkgconfig/darshan-runtime.pc file installed with
Darshan.

The modulefile that you select can be copied to a system location, or the
install location can be added to your local module path with the following
command:

----
module use /soft/darshan-2.2.3/share/craype-<VERSION>/modulefiles
----

From this point, Darshan instrumentation can be enabled for all future
application compilations by running "module load darshan".

=== Linux clusters using MPICH

Most MPICH installations produce dynamic executables by default.  To
configure Darshan in this environment you can use the following example.  We
recommend using mpicc with GNU compilers to compile Darshan.

----
./configure --with-log-path=/darshan-logs --with-jobid-env=PBS_JOBID CC=mpicc
----

The darshan-gen-* scripts described earlier in this document can be used
to create variants of the standard mpicc/mpicxx/mpif77/mpif90 scripts
that are Darshan enabled.  These scripts will work correctly for both
dynamic and statically linked executables.

=== Linux clusters using Intel MPI

Most Intel MPI installations produce dynamic executables by default.  To
configure Darshan in this environment you can use the following example:

----
./configure --with-log-path=/darshan-logs --with-jobid-env=PBS_JOBID CC=mpicc
----

.Rationale
[NOTE]
====
There is nothing unusual in this configuration except that you should use
the underlying GNU compilers rather than the Intel ICC compilers to compile
Darshan itself.
====

You can enable Darshan instrumentation at compile time by adding
`darshan-config --dyn-ld-flags` options to your linker command line.

Alternatively you can use the `LD_PRELOAD` runtime instrumentation method to
instrument executables that have already been compiled.

=== Linux clusters using Open MPI

Follow the generic instructions provided at the top of this document for
compilation, and make sure that the `CC` used for compilation is based on a
GNU compiler.

You can enable Darshan instrumentation at compile time by adding
`darshan-config --dyn-ld-flags` options to your linker command line.

Alternatively you can use the `LD_PRELOAD` runtime instrumentation method to
instrument executables that have already been compiled.

== Upgrading to Darshan 3.x from 2.x

Beginning with Darshan 3.0.0, Darshan has been rewritten to modularize its runtime environment
and log file format to simplify the addition of new I/O characterization data. The process of
compiling and installing the Darshan 3.x source code should essentially be identical to this
process on Darshan 2.x. Therefore, the installation recipes given in the previous section
should work irrespective of the Darshan version being used. Similarly, the manner in which
Darshan is used should be the same across versions -- the sections in this document regarding
Darshan link:darshan-runtime.html#_environment_preparation[environment preparation],
instrumenting link:darshan-runtime.html#_instrumenting_statically_linked_applications[statically
linked applications] and link:darshan-runtime.html#_instrumenting_dynamically_linked_applications[
dynamically linked applications], and using link:darshan-runtime.html#_runtime_environment_variables[
runtime environment variables] are equally applicable to both versions.

However, we do provide some suggestions and expectations for system administrators to keep in
mind when upgrading to Darshan 3.x:

* Darshan 2.x and Darshan 3.x produce incompatible log file formats
    ** log files named *.darshan.gz or *.darshan.bz2: Darshan 2.x format
    ** log files named *.darshan: Darshan 3.x format
        *** a field in the log file header indicates underlying compression method in 3.x logs
    ** There is currently no tool for converting 2.x logs into the 3.x log format.
    ** The `darshan-logutils` library will provide error messages to indicate whether a given
log file is incompatible with the corresponding library version.

* We encourage administrators to use the same log file directory for version 3.x as had been
used for version 2.x.
    ** Within this directory, the determination on which set of log utilities (version 2.x
or version 3.x) to use can be based on the file extension for a given log (as explained
above).

== Runtime environment variables

The Darshan library honors the following environment variables to modify
behavior at runtime:

* DARSHAN_DISABLE: disables Darshan instrumentation
* DARSHAN_INTERNAL_TIMING: enables internal instrumentation that will print the time required to startup and shutdown Darshan to stderr at run time.
* DARSHAN_LOGHINTS: specifies the MPI-IO hints to use when storing the Darshan output file.  The format is a semicolon-delimited list of key=value pairs, for example: hint1=value1;hint2=value2
* DARSHAN_MEMALIGN: specifies a value for system memory alignment
* DARSHAN_JOBID: specifies the name of the environment variable to use for the job identifier, such as PBS_JOBID
* DARSHAN_DISABLE_SHARED_REDUCTION: disables the step in Darshan aggregation in which files that were accessed by all ranks are collapsed into a single cumulative file record at rank 0.  This option retains more per-process information at the expense of creating larger log files. Note that it is up to individual instrumentation module implementations whether this environment variable is actually honored.
* DARSHAN_LOGPATH: specifies the path to write Darshan log files to. Note that this directory needs to be formatted using the darshan-mk-log-dirs script.
* DARSHAN_LOGFILE: specifies the path (directory + Darshan log file name) to write the output Darshan log to. This overrides the default Darshan behavior of automatically generating a log file name and adding it to a log file directory formatted using darshan-mk-log-dirs script.
* DARSHAN_MODMEM: specifies the maximum amount of memory (in MiB) Darshan instrumentation modules can collectively consume at runtime (if not specified, Darshan uses a default quota of 2 MiB).
* DARSHAN_MMAP_LOGPATH: if Darshan's mmap log file mechanism is enabled, this variable specifies what path the mmap log files should be stored in (if not specified, log files will be stored in `/tmp`).
* DARSHAN_EXCLUDE_DIRS: specifies a list of comma-separated paths that Darshan will not instrument at runtime (in addition to Darshan's default blacklist)
* DXT_ENABLE_IO_TRACE: setting this environment variable enables the DXT (Darshan eXtended Tracing) modules at runtime for all files instrumented by Darshan. Currently, DXT is hard-coded to use a maximum of 4 MiB of trace memory per process (in addition to memory used by other modules).
* DXT_DISABLE_IO_TRACE: setting this environment variable disables the DXT module at runtime for all files instrumented by Darshan.
* DXT_TRIGGER_CONF_PATH: File path to a DXT trace trigger configuration file, which specifies triggers used by DXT to decide which files to trace at runtime. Note that the trace triggering mechanism is overridden by the DXT_ENABLE_IO_TRACE and DXT_DISABLE_IO_TRACE environment variables.
* DARSHAN_ENABLE_NONMPI: setting this environment variable is required to generate Darshan logs for non-MPI applications
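
For instance, several of these variables can be combined for a single run. The values below are illustrative assumptions: the log file name, the 8 MiB quota, and the excluded directories are hypothetical, and `./my-app` stands for your own instrumented executable.

```shell
# Write the log to an explicit file instead of the configured log tree.
export DARSHAN_LOGFILE="$PWD/my-app.darshan"
# Raise the module memory quota from the 2 MiB default to 8 MiB.
export DARSHAN_MODMEM=8
# Skip instrumentation for files under these (hypothetical) directories.
export DARSHAN_EXCLUDE_DIRS=/opt/vendor,/scratch/cache

# Show what a program launched from this shell would inherit.
env | grep '^DARSHAN_' | sort

# ./my-app    # hypothetical instrumented executable
```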

== Debugging

=== No log file

In cases where Darshan is not generating a log file for an application, some common things to check are:

* Make sure you are looking in the correct place for logs.  Confirm the
  location with the `darshan-config --log-path` command.

* Check stderr to ensure Darshan isn't indicating any internal errors (e.g., invalid log file path)

For statically linked executables:

* Ensure that Darshan symbols are present in the underlying executable by running `nm` on it:
----
> nm test | grep darshan
0000000000772260 b darshan_core
0000000000404440 t darshan_core_cleanup
00000000004049b0 T darshan_core_initialize
000000000076b660 d darshan_core_mutex
00000000004070a0 T darshan_core_register_module
----

For dynamically linked executables:

* Ensure that the Darshan library is present in the list of shared libraries
  to be used by the application, and that it appears before the MPI library:
----
> ldd mpi-io-test
	linux-vdso.so.1 (0x00007ffd83925000)
	libdarshan.so => /home/carns/working/install/lib/libdarshan.so (0x00007f0f4a7a6000)
	libmpi.so.12 => /home/carns/working/src/spack/opt/spack/linux-ubuntu19.10-skylake/gcc-9.2.1/mpich-3.3.2-h3dybprufq7i5kt4hcyfoyihnrnbaogk/lib/libmpi.so.12 (0x00007f0f4a44f000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f0f4a241000)
        ...
----
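
The ordering check can also be scripted. The sketch below defines a hypothetical helper, `darshan_before_mpi`, that reads `ldd` output on standard input and reports whether a `libdarshan` entry appears ahead of the first `libmpi` entry. The sample input is modeled on the listing above with shortened, made-up paths.

```shell
# Read `ldd` output on stdin; succeed only if libdarshan is listed and
# appears before the first libmpi entry (or no libmpi is listed at all).
darshan_before_mpi() {
    awk '
        /libdarshan/ && !d { d = NR }
        /libmpi/     && !m { m = NR }
        END { exit !(d && (!m || d < m)) }
    '
}

# Sample input (hypothetical paths); in practice pipe in real ldd output.
if darshan_before_mpi <<'EOF'
    linux-vdso.so.1 (0x00007ffd83925000)
    libdarshan.so => /opt/darshan/lib/libdarshan.so (0x00007f0f4a7a6000)
    libmpi.so.12 => /opt/mpich/lib/libmpi.so.12 (0x00007f0f4a44f000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f0f4a241000)
EOF
then
    echo "libdarshan precedes libmpi"
else
    echo "libdarshan missing or listed after libmpi"
fi
```

Against a real executable the helper would be used as `ldd ./mpi-io-test | darshan_before_mpi`.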

General:

* Ensure that the linker is correctly linking in Darshan's runtime libraries:
    ** A common mistake is to explicitly link in the underlying MPI libraries (e.g., `-lmpich` or `-lmpichf90`)
in the link command, which can interfere with Darshan's instrumentation
        *** These libraries are usually linked in automatically by the compiler
        *** The `-show` flag of MPICH's `mpicc` compiler wrapper can be used to examine the invoked link command, for instance
    ** The linker's `-y` option can be used to verify that Darshan is properly intercepting the `MPI_Init`
function (e.g., by setting `CFLAGS='-Wl,-yMPI_Init'`), which it uses to initialize its runtime structures
----
/usr/common/software/darshan/3.0.0-pre3/lib/libdarshan.a(darshan-core-init-finalize.o): definition of MPI_Init
----