Commit 31e0fb03 authored by Shane Snyder

update runtime, util, and modularization docs

parent 1c1a9baa
...@@ -16,7 +16,10 @@ used by the application. ...@@ -16,7 +16,10 @@ used by the application.
The darshan-runtime instrumentation only instruments MPI applications (the The darshan-runtime instrumentation only instruments MPI applications (the
application must at least call `MPI_Init()` and `MPI_Finalize()`). However, application must at least call `MPI_Init()` and `MPI_Finalize()`). However,
it captures both MPI-IO and POSIX file access. It also captures limited it captures both MPI-IO and POSIX file access. It also captures limited
information about HDF5 and PnetCDF access. information about HDF5 and PnetCDF access. Darshan also exposes an API that
can be used to develop and add new instrumentation modules (for other I/O library
interfaces or to gather system-specific data, for instance), as detailed in
http://www.mcs.anl.gov/research/projects/darshan/docs/darshan-modularization.html[this document].
This document provides generic installation instructions, but "recipes" for This document provides generic installation instructions, but "recipes" for
several common HPC systems are provided at the end of the document as well. several common HPC systems are provided at the end of the document as well.
...@@ -311,7 +314,7 @@ Please set your environment to use the GNU programming environment before ...@@ -311,7 +314,7 @@ Please set your environment to use the GNU programming environment before
configuring or compiling Darshan. Although Darshan can be built with a configuring or compiling Darshan. Although Darshan can be built with a
variety of compilers, the GNU compilers are recommended because it will variety of compilers, the GNU compilers are recommended because it will
produce a Darshan library that is interoperable with the widest range produce a Darshan library that is interoperable with the widest range
of compmilers and linkers. On most Cray systems you can enable the GNU of compilers and linkers. On most Cray systems you can enable the GNU
programming environment with a command similar to "module swap PrgEnv-pgi programming environment with a command similar to "module swap PrgEnv-pgi
PrgEnv-gnu". Please see your site documentation for information about PrgEnv-gnu". Please see your site documentation for information about
how to switch programming environments. how to switch programming environments.
......
...@@ -66,12 +66,16 @@ application will likely be found in a centralized directory, with the path ...@@ -66,12 +66,16 @@ application will likely be found in a centralized directory, with the path
and log file name in the following format: and log file name in the following format:
---- ----
<YEAR>/<MONTH>/<DAY>/<USERNAME>_<BINARY_NAME>_<JOB_ID>_<DATE>_<UNIQUE_ID>_<TIMING>.darshan.gz <YEAR>/<MONTH>/<DAY>/<USERNAME>_<BINARY_NAME>_<JOB_ID>_<DATE>_<UNIQUE_ID>_<TIMING>.darshan
---- ----
This is a binary format file that summarizes I/O activity. As of version This is a binary format file that summarizes I/O activity. As of version
2.0.0 of Darshan, this file is portable and does not have to be analyzed on 2.0.0 of Darshan, this file is portable and does not have to be analyzed on
the same system that executed the job. the same system that executed the job. Also, note that Darshan logs generated
with Darshan versions preceding version 3.0 will have the extension `darshan.gz`
(or `darshan.bz2` if compressed using bzip2 format). These logs are not compatible
with Darshan 3.0 utilities, and thus must be analyzed using an appropriate version
(2.x) of the darshan-util package.
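The extension rule described above can be sketched as a small helper that decides which generation of darshan-util a log file most likely needs. This is only an illustration: `classify_darshan_log`, `ends_with`, and the `log_generation` enum are hypothetical names and are not part of darshan-util, and the check looks at the file name only, not at the log contents.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical classification; these names are NOT part of darshan-util. */
enum log_generation { LOG_UNKNOWN = 0, LOG_PRE_3_0, LOG_3_0 };

/* Return nonzero if `name` ends with `suffix`. */
static int ends_with(const char *name, const char *suffix)
{
    size_t n = strlen(name);
    size_t s = strlen(suffix);
    return n >= s && strcmp(name + (n - s), suffix) == 0;
}

/* Guess the log generation from the file extension alone. */
enum log_generation classify_darshan_log(const char *name)
{
    assert(name != NULL);
    /* Pre-3.0 logs carry a compression suffix after ".darshan". */
    if (ends_with(name, ".darshan.gz") || ends_with(name, ".darshan.bz2"))
        return LOG_PRE_3_0;
    /* Logs written in the 3.0 naming scheme end in plain ".darshan". */
    if (ends_with(name, ".darshan"))
        return LOG_3_0;
    return LOG_UNKNOWN;
}
```

A wrapper script or tool could call `classify_darshan_log()` on each file name and dispatch `LOG_PRE_3_0` logs to a 2.x darshan-util installation.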
=== darshan-job-summary.pl === darshan-job-summary.pl
...@@ -462,9 +466,9 @@ Byte and for the aggregate performance is MiB/s (1024*1024 Bytes/s). ...@@ -462,9 +466,9 @@ Byte and for the aggregate performance is MiB/s (1024*1024 Bytes/s).
===== Files ===== Files
Use the `--file` option to get totals based on file usage. Use the `--file` option to get totals based on file usage.
The first column is the count of files for that type, the second column is Each line has 3 columns. The first column is the count of files for that
number of bytes for that type and the third column is the maximum offset type of file, the second column is number of bytes for that type, and the third
accessed. column is the maximum offset accessed.
* total: All files * total: All files
* read_only: Files that were only read from * read_only: Files that were only read from
...@@ -473,9 +477,6 @@ accessed. ...@@ -473,9 +477,6 @@ accessed.
* unique: Files that were opened on only one rank * unique: Files that were opened on only one rank
* shared: Files that were opened by more than one rank * shared: Files that were opened by more than one rank
Each line has 3 columns. The first column is the count of files for that
type of file, the second column is number of bytes for that type, and the third
column is the maximum offset accessed.
.Example output .Example output
---- ----
...@@ -557,8 +558,6 @@ If the `--bzip2` flag is given, then the output file will be re-compressed in ...@@ -557,8 +558,6 @@ If the `--bzip2` flag is given, then the output file will be re-compressed in
bzip2 format rather than libz format. It also has command line options for bzip2 format rather than libz format. It also has command line options for
anonymizing personal data, adding metadata annotation to the log header, and anonymizing personal data, adding metadata annotation to the log header, and
restricting the output to a specific instrumented file. restricting the output to a specific instrumented file.
* darshan-diff: compares two darshan log files and shows counters that
differ.
* darshan-analyzer: walks an entire directory tree of Darshan log files and * darshan-analyzer: walks an entire directory tree of Darshan log files and
produces a summary of the types of access methods used in those log files. produces a summary of the types of access methods used in those log files.
* darshan-logutils*: this is a library rather than an executable, but it * darshan-logutils*: this is a library rather than an executable, but it
......
...@@ -32,8 +32,8 @@ cd darshan ...@@ -32,8 +32,8 @@ cd darshan
git checkout dev-modular git checkout dev-modular
---- ----
For details on configuring and building the Darshan runtime and utility repositories, For details on configuring, building, and using the Darshan runtime and utility
consult the documentation from previous versions repositories, consult the documentation from previous versions
(http://www.mcs.anl.gov/research/projects/darshan/docs/darshan-runtime.html[darshan-runtime] and (http://www.mcs.anl.gov/research/projects/darshan/docs/darshan-runtime.html[darshan-runtime] and
http://www.mcs.anl.gov/research/projects/darshan/docs/darshan-util.html[darshan-util]) -- the http://www.mcs.anl.gov/research/projects/darshan/docs/darshan-util.html[darshan-util]) -- the
necessary steps for building these repositories should not have changed in the new version of necessary steps for building these repositories should not have changed in the new version of
...@@ -121,13 +121,13 @@ component so it is included in the output I/O characterization. ...@@ -121,13 +121,13 @@ component so it is included in the output I/O characterization.
The static initialization approach is useful for modules that do not have function calls The static initialization approach is useful for modules that do not have function calls
that can be intercepted and instead can just grab all I/O characterization data at Darshan that can be intercepted and instead can just grab all I/O characterization data at Darshan
startup or shutdown time. A module can be statically initialized at Darshan startup time startup or shutdown time. A module can be statically initialized at Darshan startup time
by adding it to the `mod_static_init_fns` list at the top of the `lib/darshan-core.c` by adding its initialization routine to the `mod_static_init_fns` list at the top of the
source file. `lib/darshan-core.c` source file.
*NOTE*: Modules which require static initialization should typically provide a corresponding *NOTE*: Modules may wish to add a corresponding configure option to disable the module
configure option to prevent the module from being built and capturing I/O data. The ability from attempting to gather I/O data. The ability to disable a module using a configure
to disable a module using a configure option is especially necessary for system-specific option is especially necessary for system-specific modules which can not be built or
modules which can not be built or used on other systems. used on other systems.
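As a rough sketch of the static initialization idea, the snippet below walks a NULL-terminated list of initialization routines and calls each one. It assumes that the list holds plain `void (*)(void)` function pointers; the actual declaration of `mod_static_init_fns` in `lib/darshan-core.c` may differ, and `example_module_init`, `example_static_init_fns`, and `run_static_inits` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed shape of a static initialization routine (hypothetical typedef). */
typedef void (*mod_init_fn)(void);

/* Hypothetical module initializer; a real module would register itself
 * and gather its characterization data here. */
static int example_init_calls = 0;
static void example_module_init(void)
{
    example_init_calls++;
}

/* NULL-terminated list, in the spirit of mod_static_init_fns. */
static mod_init_fn const example_static_init_fns[] = {
    example_module_init,
    NULL
};

/* Call every routine in the list; return how many were called. */
int run_static_inits(const mod_init_fn *fns)
{
    int count = 0;
    assert(fns != NULL);
    for (size_t i = 0; fns[i] != NULL; i++) {
        fns[i]();
        count++;
    }
    return count;
}
```

A module guarded by a configure option could wrap its list entry in a preprocessor conditional so that the routine is omitted when the module is disabled.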
Most instrumentation modules can just bootstrap themselves within wrapper functions during Most instrumentation modules can just bootstrap themselves within wrapper functions during
normal application execution. Each of Darshan's current I/O library instrumentation modules normal application execution. Each of Darshan's current I/O library instrumentation modules
...@@ -197,7 +197,7 @@ Within darshan-runtime, the darshan-core component manages the initialization an ...@@ -197,7 +197,7 @@ Within darshan-runtime, the darshan-core component manages the initialization an
Darshan environment, provides an interface for modules to register themselves and their data Darshan environment, provides an interface for modules to register themselves and their data
records with Darshan, and manages the compressing and the writing of the resultant I/O records with Darshan, and manages the compressing and the writing of the resultant I/O
characterization. As illustrated in Figure 1, the darshan-core runtime environment intercepts characterization. As illustrated in Figure 1, the darshan-core runtime environment intercepts
`MPI_Init` and `MPI_Finalize` routines to initialize and shutdown the darshan runtime environment, `MPI_Init` and `MPI_Finalize` routines to initialize and shutdown the Darshan runtime environment,
respectively. respectively.
Each of the functions provided by `darshan-core` to interface with instrumentation modules are Each of the functions provided by `darshan-core` to interface with instrumentation modules are
...@@ -212,8 +212,8 @@ void darshan_core_register_module( ...@@ -212,8 +212,8 @@ void darshan_core_register_module(
int *sys_mem_alignment); int *sys_mem_alignment);
The `darshan_core_register_module` function registers Darshan instrumentation modules with the The `darshan_core_register_module` function registers Darshan instrumentation modules with the
`darshan-core` runtime environment. This function needs to be called at least once for any module `darshan-core` runtime environment. This function needs to be called once for any module that
that will contribute data to Darshan's final I/O characterization. will contribute data to Darshan's final I/O characterization.
* _mod_id_ is a unique identifier for the given module, which is defined in the Darshan log * _mod_id_ is a unique identifier for the given module, which is defined in the Darshan log
format header file (`darshan-log-format.h`). format header file (`darshan-log-format.h`).
...@@ -398,8 +398,8 @@ struct darshan_record_ref *, which should be initialized to `NULL` for reading). ...@@ -398,8 +398,8 @@ struct darshan_record_ref *, which should be initialized to `NULL` for reading).
is defined by the `uthash` hash table implementation and includes corresponding macros for is defined by the `uthash` hash table implementation and includes corresponding macros for
searching, iterating, and deleting records from the hash. For detailed documentation on using this searching, iterating, and deleting records from the hash. For detailed documentation on using this
hash table, consult `uthash` documentation in `darshan-util/uthash-1.9.2/doc/txt/userguide.txt`. hash table, consult `uthash` documentation in `darshan-util/uthash-1.9.2/doc/txt/userguide.txt`.
The `darshan-posix-parser` utility (for parsing POSIX module information out of a Darshan log) The `darshan-parser` utility (for parsing module information out of a Darshan log) provides an
provides an example of how this hash table may be used. Returns `0` on success, `-1` on failure. example of how this hash table may be used. Returns `0` on success, `-1` on failure.
[source,c] [source,c]
int darshan_log_getmod(darshan_fd fd, darshan_module_id mod_id, void *mod_buf, int mod_buf_sz); int darshan_log_getmod(darshan_fd fd, darshan_module_id mod_id, void *mod_buf, int mod_buf_sz);
...@@ -448,9 +448,10 @@ the module's record structure: ...@@ -448,9 +448,10 @@ the module's record structure:
* Add a module identifier to the `DARSHAN_MODULE_IDS` macro at the top of the `darshan-log-format.h` * Add a module identifier to the `DARSHAN_MODULE_IDS` macro at the top of the `darshan-log-format.h`
header. In this macro, the first field is a corresponding enum value that can be used to header. In this macro, the first field is a corresponding enum value that can be used to
identify the module, the second field is a string name for the module, and the third field identify the module, the second field is a string name for the module, the third field is the
is a corresponding pointer to a Darshan log utility implementation for this module (which current version number of the given module's log format, and the fourth field is a corresponding
can be set to `NULL` until the module has its own log utility implementation). pointer to a Darshan log utility implementation for this module (which can be set to `NULL`
until the module has its own log utility implementation).
* Add a top-level header that defines an I/O data record structure for the module. Consider * Add a top-level header that defines an I/O data record structure for the module. Consider
the "NULL" module and POSIX module log format headers for examples (`darshan-null-log-format.h` the "NULL" module and POSIX module log format headers for examples (`darshan-null-log-format.h`
......
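The four-field module entry described above follows the common X-macro pattern, which can be sketched as follows. Everything here is illustrative: `EXAMPLE_MODULE_IDS`, the `EXAMPLE_*` enum values, the version numbers, and the helper functions are hypothetical, and the real `DARSHAN_MODULE_IDS` macro in `darshan-log-format.h` may be laid out differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical module table. Each entry is:
 * X(enum value, string name, log format version, log utility pointer) */
#define EXAMPLE_MODULE_IDS \
    X(EXAMPLE_NULL_MOD,  "NULL",  1, NULL) \
    X(EXAMPLE_POSIX_MOD, "POSIX", 1, NULL) \
    X(EXAMPLE_MY_MOD,    "MYMOD", 2, NULL)

/* First field: expand the table into an enum of module identifiers. */
#define X(id, name, ver, utils) id,
typedef enum { EXAMPLE_MODULE_IDS EXAMPLE_MAX_MODS } example_module_id;
#undef X

/* Second field: expand the same table into an array of module names. */
#define X(id, name, ver, utils) name,
static const char * const example_module_names[] = { EXAMPLE_MODULE_IDS };
#undef X

/* Third field: expand the same table into per-module format versions. */
#define X(id, name, ver, utils) ver,
static const int example_module_versions[] = { EXAMPLE_MODULE_IDS };
#undef X

/* Look up a module name; returns NULL for an out-of-range identifier. */
const char *example_module_name(int id)
{
    return (id >= 0 && id < EXAMPLE_MAX_MODS) ? example_module_names[id] : NULL;
}

/* Look up a module's log format version; the identifier must be valid. */
int example_module_version(int id)
{
    assert(id >= 0 && id < EXAMPLE_MAX_MODS);
    return example_module_versions[id];
}
```

Keeping all fields in one table means that adding a module is a single-line change, and the enum, name, and version arrays cannot drift out of sync. The fourth field is left as `NULL` here, mirroring the case where a module has no log utility implementation yet.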