Darshan-runtime installation and usage
======================================

== Introduction

This document describes darshan-runtime, which is the instrumentation
portion of the Darshan characterization tool.  It should be installed on the
system where you intend to collect I/O characterization information.

Darshan instruments applications via either compile time wrappers for static
executables or dynamic library preloading for dynamic executables.  An
application that has been instrumented with Darshan will produce a single
log file each time it is executed.  This log summarizes the I/O access patterns
used by the application.

The darshan-runtime instrumentation only instruments MPI applications (the
application must at least call `MPI_Init()` and `MPI_Finalize()`).  However,
it captures both MPI-IO and POSIX file access.  It also captures limited
information about HDF5 and PnetCDF access.

This document provides generic installation instructions, but "recipes" for
several common HPC systems are provided at the end of the document as well.

More information about Darshan can be found at the 
http://www.mcs.anl.gov/darshan[Darshan web site].

== Requirements

* MPI C compiler
* zlib development headers and library

== Compilation

.Configure and build example
----
tar -xvzf darshan-<version-number>.tar.gz
cd darshan-<version-number>/darshan-runtime
./configure --with-mem-align=8 --with-log-path=/darshan-logs --with-jobid-env=PBS_JOBID CC=mpicc
make
make install
----

.Detecting file size and alignment
[NOTE]
====
You can also add the `--enable-stat-at-open` option to cause the Darshan
library to issue an additional `stat()` system call on each file the first time that
it is opened on each process.  This allows Darshan to detect the file
alignment (and subsequent unaligned accesses).  It also allows Darshan to
detect the size of files at open time before any I/O is performed.
Unfortunately, this option can cause significant overhead at scale on file
systems such as PVFS or Lustre that must contact every server for a given
file in order to satisfy a stat request.  We therefore disable this
feature by default.
====

.Explanation of configure arguments:
* `--with-mem-align` (mandatory): This value is system-dependent and will be
used by Darshan to determine if the buffer for a read or write operation is
aligned in memory.
* `--with-log-path` (this, or `--with-log-path-by-env`, is mandatory): This
specifies the parent directory for the directory tree where Darshan logs
will be placed.
* `--with-jobid-env` (mandatory): This specifies the environment variable that
Darshan should check to determine the jobid of a job.  Common values are
`PBS_JOBID` or `COBALT_JOBID`.  If you are not using a scheduler (or your
scheduler does not advertise the job ID) then you can specify `NONE` here.
Darshan will fall back to using the pid of the rank 0 process if the
specified environment variable is not set.
* `CC=`: specifies the MPI C compiler to use for compilation
* `--with-log-path-by-env`: specifies an environment variable to use to
determine the log path at run time.
* `--with-log-hints=`: specifies hints to use when writing the Darshan log
file.  See `./configure --help` for details.
* `--with-zlib=`: specifies an alternate location for the zlib development
header and library.

=== Cross compilation

On some systems (notably the IBM Blue Gene series), the login nodes do not
have the same architecture or runtime environment as the compute nodes.  In
this case, you must configure darshan-runtime to be built using a cross
compiler.  The following configure arguments show an example for the BG/P system:

----
--host=powerpc-bgp-linux CC=/bgsys/drivers/ppcfloor/comm/default/bin/mpicc 
----

== Environment preparation

Once darshan-runtime has been installed, you must prepare a location
in which to store the Darshan log files and configure an instrumentation method.

=== Log directory

This step can be safely skipped if you configured darshan-runtime using the
`--with-log-path-by-env` option.  A more typical configuration uses a static
directory hierarchy for Darshan log
files.

The `darshan-mk-log-dirs.pl` utility will configure the path specified at
configure time to include
subdirectories organized by year, month, and day in which log files will be
placed. The deepest subdirectories will have sticky permissions to enable
multiple users to write to the same directory.  If the log directory is
shared system-wide across many users then the following script should be run
as root.
 
----
darshan-mk-log-dirs.pl
----

.A note about log directory permissions
[NOTE]
====
All log files written by Darshan have permissions set to allow
read access only by the owner of the file.  You can modify this behavior,
however, by specifying the `--enable-group-readable-logs` option at
configure time.  One notable deployment scenario is to configure
Darshan and the log directories so that all logs are readable by both the
end user and a Darshan administrators group.  This can be done with the
following steps:

* set the `--enable-group-readable-logs` option at configure time
* create the log directories with `darshan-mk-log-dirs.pl`
* recursively set the group ownership of the log directories to the Darshan
administrators group
* recursively set the setgid bit on the log directories
====
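
The group-ownership and setgid steps above can be sketched as a short shell
sequence.  This is only a sketch: the log path and administrators group name
below are placeholders, so substitute the values used at your site.

----
# Placeholders; substitute your site's log path and administrators group.
LOGDIR=/darshan-logs
ADMIN_GROUP=darshan-admin

# Recursively hand ownership of the log tree to the administrators group.
chgrp -R "$ADMIN_GROUP" "$LOGDIR"

# Recursively set the setgid bit so that new subdirectories and log files
# inherit the administrators group.
chmod -R g+s "$LOGDIR"
----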


=== Instrumentation method

The instrumentation method to use depends on whether the executables
produced by your MPI compiler are statically or dynamically linked.  If you
are unsure, you can check by running `ldd <executable_name>` on an example
executable.  Dynamically-linked executables will produce a list of shared
libraries when this command is executed.
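
For example, the check might look like this (`a.out` is an illustrative
stand-in for your MPI executable):

----
# Illustrative check; "a.out" stands in for your MPI executable.
ldd ./a.out
# Dynamically linked executables print shared library lines such as:
#   libmpich.so.10 => /usr/lib/libmpich.so.10 (0x00007f...)
# Statically linked executables print "not a dynamic executable" instead.
----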

Most MPI compilers allow you to toggle dynamic or static linking via options
such as `-dynamic` or `-static`.  Please check your MPI compiler man page
for details if you intend to force one mode or the other.

== Instrumenting statically-linked applications

Statically linked executables must be instrumented at compile time.  The
simplest way to do this is to generate an MPI compiler script (e.g. `mpicc`)
that includes the link options and libraries needed by Darshan.  Once this
is done, Darshan instrumentation is transparent; you simply compile
applications using the darshan-enabled MPI compiler scripts.

For MPICH-based MPI libraries, such as MPICH1, MPICH2, or MVAPICH, these
wrapper scripts can be generated automatically.  The following example
illustrates how to produce wrappers for C, C++, and Fortran compilers:

----
darshan-gen-cc.pl `which mpicc` --output mpicc.darshan
darshan-gen-cxx.pl `which mpicxx` --output mpicxx.darshan
darshan-gen-fortran.pl `which mpif77` --output mpif77.darshan
darshan-gen-fortran.pl `which mpif90` --output mpif90.darshan
----

Please see the Cray recipe in this document for instructions on adding
Darshan support to the Cray compiler scripts.  
For other MPI libraries you must manually modify the MPI compiler scripts to
add the necessary link options and libraries.  Please see the
`darshan-gen-*` scripts for examples or contact the Darshan users mailing
list for help.

== Instrumenting dynamically-linked applications

For dynamically-linked executables, Darshan relies on the `LD_PRELOAD`
environment variable to insert instrumentation at run time.  The executables
should be compiled using the normal, unmodified MPI compiler.

To use this mechanism, set the `LD_PRELOAD` environment variable to the full
path to the Darshan shared library, as in this example:

----
export LD_PRELOAD=/home/carns/darshan-install/lib/libdarshan.so
----

You can then run your application as usual.  Some environments may require a
special `mpirun` or `mpiexec` command line argument to propagate the
environment variable to all processes.  Other environments may require a
scheduler submission option to control this behavior.  Please check your
local site documentation for details.
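
As a sketch, two common but implementation-specific forms look like the
following (`my_app` is an illustrative executable name); launcher options
vary between MPI implementations, so check your local documentation before
relying on either one.

----
# MPICH/Hydra-style launchers accept -env to set a variable for all processes:
mpiexec -n 4 -env LD_PRELOAD /home/carns/darshan-install/lib/libdarshan.so ./my_app

# Open MPI's mpirun uses -x to export a variable to all processes:
mpirun -n 4 -x LD_PRELOAD=/home/carns/darshan-install/lib/libdarshan.so ./my_app
----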

=== Instrumenting dynamically-linked Fortran applications

Please follow the general steps outlined in the previous section.  For
Fortran applications compiled with MPICH you may have to take the additional
step of adding
`libfmpich.so` to your `LD_PRELOAD` environment variable. For example:

----
export LD_PRELOAD=libfmpich.so:/home/carns/darshan-install/lib/libdarshan.so
----

== Darshan installation recipes

The following recipes provide examples for prominent HPC systems.
These are intended as a starting point; you will most likely have to adjust
paths and options to reflect the specifics of your system.

=== IBM Blue Gene/P

The IBM Blue Gene/P series produces static executables by default, uses a
different architecture for login and compute nodes, and uses an MPI
environment based on MPICH.

The following example shows how to configure Darshan on a BG/P system:

----
./configure --with-mem-align=16 \
 --with-log-path=/home/carns/working/darshan/releases/logs \
 --prefix=/home/carns/working/darshan/install --with-jobid-env=COBALT_JOBID \
 --with-zlib=/soft/apps/zlib-1.2.3/ \
 --host=powerpc-bgp-linux CC=/bgsys/drivers/ppcfloor/comm/default/bin/mpicc 
----

.Rationale
[NOTE]
====
The memory alignment is set to 16 not because that is the proper alignment
for the BG/P CPU architecture, but because that is the optimal alignment for
the network transport used between compute nodes and I/O nodes in the
system.  The jobid environment variable is set to `COBALT_JOBID` in this
case for use with the Cobalt scheduler, but other BG/P systems may use
different schedulers.  The `--with-zlib` argument is used to point to a
version of zlib that has been compiled for use on the compute nodes rather
than the login node.  The `--host` argument is used to force cross-compilation
of Darshan.  The `CC` variable is set to point to a stock MPI compiler.
====

Once Darshan has been installed, use the `darshan-gen-*.pl` scripts as
described earlier in this document to produce darshan-enabled MPI compilers.
This method has been widely used and tested with both the GNU and IBM XL
compilers.

=== Cray XE6 (or similar)

The Cray programming environment produces static executables by default.
Darshan should therefore be configured to insert instrumentation at link
time by way of compiler script wrappers.  Darshan 2.2.3 supports GNU,
PGI, Cray, Pathscale, and Intel compilers.  The following documentation
describes how to modify the Cray compiler wrappers to add Darshan
capability, as well as how to install a Darshan software module that
allows users to enable or disable Darshan instrumentation.

==== Building and installing Darshan

Please set your environment to use the GNU programming environment before
configuring or compiling Darshan.  Although Darshan can be built with a
variety of compilers, the GNU compilers are recommended because they
produce a Darshan library that is interoperable with a variety of linkers.
On most Cray systems you can enable the GNU programming environment with
a command similar to `module swap PrgEnv-pgi PrgEnv-gnu`.  Please see
your site documentation for information about how to switch programming
environments.

The following example shows how to configure and build Darshan on a Cray
system using the GNU programming environment.  Please adjust the
`--with-log-path` and `--prefix` arguments to point to the desired log file
path and installation path, respectively.

----
module swap PrgEnv-pgi PrgEnv-gnu
./configure --with-mem-align=8 \
 --with-log-path=/shared-file-system/darshan-logs \
 --prefix=/soft/darshan-2.2.3 \
 --with-jobid-env=PBS_JOBID --disable-cuserid CC=cc
make install
module swap PrgEnv-gnu PrgEnv-pgi
----

.Rationale
[NOTE]
====
The job ID is set to `PBS_JOBID` for use with a Torque or PBS based scheduler.
The `CC` variable is configured to point to the standard MPI compiler.

The `--disable-cuserid` argument is used to prevent Darshan from attempting to
use the `cuserid()` function to retrieve the user name associated with a job.
Darshan automatically falls back to other methods if this function fails,
but on some Cray environments (notably the Beagle XE6 system as of March 2012)
the `cuserid()` call triggers a segmentation fault.  With this option set,
Darshan will typically use the `LOGNAME` environment variable to determine a
userid.
====

As in any Darshan installation, the `darshan-mk-log-dirs.pl` script can then be
used to create the appropriate directory hierarchy for storing Darshan log
files in the `--with-log-path` directory.

==== Compiler wrappers (system-wide installation)

.Warning
[NOTE]
====
The instructions in this section will modify the default compiler scripts
that underlie the system-wide CC, ftn, and cc scripts.  Please proceed
with caution.  We recommend performing the following steps
on a copy of the scripts and then moving them to the correct location afterwards.
====

The Darshan distribution includes a patch to the Cray programming environment 
that adds Darshan capability to the compiler scripts.  It does not modify the
behavior of the scripts in any way unless the `CRAY_DARSHAN_DIR` environment
variable is set by the Darshan software module.  This approach is similar to
that used by the HDF5, NetCDF, and PETSc packages on Cray XE6 systems.  Darshan
requires compiler script modifications in order to apply instrumentation
libraries and link options in the correct order relative to other libraries used
at link time.

Two patches are provided in the Darshan source tree:

* `cray-xt-asyncpe-5.10-darshan.patch`: for xt-asyncpe versions 5.10 or 5.11
* `cray-xt-asyncpe-5.12-darshan.patch`: for xt-asyncpe versions 5.12 or higher

Perform the following steps to modify the system compiler scripts 
after selecting the appropriate patch.  This example assumes the use of the
patch for xt-asyncpe version 5.12 or higher.

----
cd $ASYNCPE_DIR/bin
patch -p1 --dry-run < /home/carns/darshan-2.2.3/darshan-runtime/share/cray/cray-xt-asyncpe-5.12-darshan.patch
# CONFIRM THE RESULTS OF THE DRY RUN SHOWN ABOVE
patch -p1 < /home/carns/darshan-2.2.3/darshan-runtime/share/cray/cray-xt-asyncpe-5.12-darshan.patch
----

The next step is to install the Darshan software module.  Note that the module
file will be found in the Darshan installation directory, as it is generated 
automatically based on configuration parameters:

----
cp -r /soft/darshan-2.2.3/share/cray/modulefiles/darshan /opt/modulefiles/
----

Users (or administrators) can now enable or disable darshan instrumentation by 
loading or unloading the "darshan" module.  Note that the module file also
includes commented-out examples that may be useful for deploying Darshan in
different configurations, including how to use `LD_PRELOAD` for dynamically
linked executables and how to use Darshan with a different set of compiler
scripts than those found in the normal system path.

==== Compiler wrappers (user installation)

In order to install Darshan for a single user in a Cray environment, the steps
are similar to those described above, except that the compiler script
modifications are applied to a local copy of the compiler scripts and the 
Darshan module is added locally rather than globally.  Note that the
`cray-xt-asyncpe-5.12-darshan.patch` is intended for use with xt-asyncpe versions
5.12 or higher.  Please use `cray-xt-asyncpe-5.10-darshan.patch` for versions 5.10 or 5.11.

----
mkdir xt-asyncpe-darshan
cp -r $ASYNCPE_DIR/* xt-asyncpe-darshan
cd xt-asyncpe-darshan/bin
patch -p1 --dry-run < /home/carns/darshan-2.2.3/darshan-runtime/share/cray/cray-xt-asyncpe-5.12-darshan.patch
# CONFIRM THE RESULTS OF THE DRY RUN SHOWN ABOVE
patch -p1 < /home/carns/darshan-2.2.3/darshan-runtime/share/cray/cray-xt-asyncpe-5.12-darshan.patch

module use /soft/darshan-2.2.3/share/cray/modulefiles/
----

In addition to loading the "darshan" software module, you must also set the
following environment variables in order to use the modified compiler
scripts:

----
setenv ASYNCPE_DIR /home/carns/xt-asyncpe-darshan
setenv PATH "/home/carns/xt-asyncpe-darshan/bin:$PATH"
----

=== Linux clusters using Intel MPI 

Most Intel MPI installations produce dynamic executables by default.  To
configure Darshan in this environment you can use the following example:

----
./configure --with-mem-align=8 --with-log-path=/darshan-logs --with-jobid-env=PBS_JOBID CC=mpicc
----

.Rationale
[NOTE]
====
There is nothing unusual in this configuration except that you should use
the underlying GNU compilers rather than the Intel ICC compilers to compile
Darshan itself.
====

You can use the `LD_PRELOAD` method described earlier in this document to
instrument executables compiled with the Intel MPI compiler scripts.  This
method has been briefly tested using both GNU and Intel compilers.

.Caveat
[NOTE]
====
Darshan is only known to work with C and C++ executables generated by the
Intel MPI suite.  Darshan will not produce instrumentation for Fortran
executables.  For more details please check this Intel forum discussion:

http://software.intel.com/en-us/forums/showthread.php?t=103447&o=a&s=lr
====

=== Linux clusters using MPICH 

Follow the generic instructions provided at the top of this document.  The
only modification is to make sure that the `CC` used for compilation is
based on a GNU compiler.  Once Darshan has been installed, it should be
capable of instrumenting executables built with GNU, Intel, and PGI
compilers.

=== Linux clusters using Open MPI

Follow the generic instructions provided at the top of this document for
compilation, and make sure that the `CC` used for compilation is based on a
GNU compiler.

Open MPI typically produces dynamically linked executables by default, which
means that you should use the `LD_PRELOAD` method to instrument executables
that have been built with Open MPI.  Darshan is only compatible with Open
MPI 1.6.4 and newer.  For more details on why Darshan is not compatible with
older versions of Open MPI, please refer to the following mailing list discussion:

http://www.open-mpi.org/community/lists/devel/2013/01/11907.php