Darshan-util installation and usage
===================================

== Introduction

This document describes darshan-util, which is a collection of tools that
aid in parsing and summarizing log files produced by the darshan-runtime
instrumentation.  The two packages can be installed independently.
For example, you may wish to install darshan-util by itself on a
workstation in order to analyze logs that were produced on a separate
HPC system.  Darshan log files are platform-independent and can be
processed on any architecture.

== Requirements

Darshan-util has only been tested in Linux environments, but will likely
work in other Unix-like environments as well.  

.Hard requirements
* C compiler
* zlib development headers and library (zlib-dev or similar)

.Optional requirements
* libbz2 development headers and library (libbz2-dev or similar)
* Perl
* pdflatex
* gnuplot 4.2 or later
* epstopdf

== Compilation

.Configure and build example
----
tar -xvzf darshan-<version-number>.tar.gz
cd darshan-<version-number>/darshan-util
./configure
make
make install
----

You can specify `--prefix` to install darshan-util in a specific location
(such as in your home directory for non-root installations).  See
`./configure --help` for additional optional arguments, including how to
specify alternative paths for zlib and libbz2 development libraries.
darshan-util also supports VPATH or "out-of-tree" builds if you prefer that
method of compilation.
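
As a minimal sketch of the two options above, the following shell function runs an out-of-tree build with a custom `--prefix`. The function name `vpath_build` and its argument order are hypothetical and not part of Darshan; only `./configure`, `--prefix`, `make`, and `make install` come from the instructions above.

```shell
# Hypothetical helper (not shipped with Darshan): out-of-tree build with a
# custom install prefix.
#   $1 = path to the darshan-util source directory
#   $2 = build directory (created if missing)
#   $3 = install prefix, given as an absolute path
# The body runs in a subshell, so the caller's working directory is unchanged.
vpath_build() (
    src=$(cd "$1" && pwd) || exit 1      # resolve the source dir to an absolute path
    mkdir -p "$2" && cd "$2" || exit 1   # build outside the source tree
    "$src/configure" --prefix="$3" && make && make install
)

# Example call (paths are placeholders):
#   vpath_build darshan-<version-number>/darshan-util /tmp/darshan-build "$HOME/darshan"
```

Installing under a prefix in your home directory avoids the need for root access; remember to add `<prefix>/bin` to your `PATH` afterwards.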

== Analyzing log files

Each time a darshan-instrumented application is executed, it will generate a
single log file summarizing the I/O activity from that application.  See the
darshan-runtime documentation for more details, but the log file for a given
application will likely be found in a centralized directory, with the path
and log file name in the following format:

----
<YEAR>/<MONTH>/<DAY>/<USERNAME>_<BINARY_NAME>_<JOB_ID>_<DATE>.darshan.gz
----

This is a binary format file that summarizes I/O activity. As of version
2.0.0 of Darshan, this file is portable and does not have to be analyzed on
the same system that executed the job. 
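
To locate logs that follow the naming format above, a small helper can assemble a shell glob for one user and one day. This is only a sketch: the function name `darshan_log_glob` and the log root path are hypothetical, and the month and day are used exactly as passed because this document does not state whether those directory names are zero-padded.

```shell
# Hypothetical helper: print a glob matching one user's logs for one day.
#   $1 = root of the centralized log directory
#   $2 = year, $3 = month, $4 = day (written as they appear on disk)
#   $5 = user name
darshan_log_glob() {
    printf '%s/%s/%s/%s/%s_*.darshan.gz\n' "$1" "$2" "$3" "$4" "$5"
}

# Example call (the root path is a placeholder); the command substitution is
# left unquoted on purpose so that the shell expands the glob:
#   ls $(darshan_log_glob /path/to/darshan-logs 2013 7 27 carns)
```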

=== darshan-job-summary.pl

You can generate a graphical summary of the I/O activity recorded in a log
file by using the `darshan-job-summary.pl` tool, as in the following
example:

----
darshan-job-summary.pl carns_my-app_id114525_7-27-58921_19.darshan.gz
----

This utility requires Perl, pdflatex, epstopdf, and gnuplot in order to
generate its summary.  By default, the output is written to a multi-page
PDF file named after the input file (in this case it would
produce a `carns_my-app_id114525_7-27-58921_19.pdf` output file).
You can also manually specify the name of the output file using the
`--output` argument.
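
When summarizing several logs at once, it can help to set the output name explicitly for each one. The sketch below derives a PDF name by replacing the `.darshan.gz` suffix, mirroring the default naming described above. The helper names `summary_pdf_name` and `summarize_all` are hypothetical, and the sketch assumes that `--output` takes the file name as its next argument.

```shell
# Hypothetical helper: map "<dir>/<name>.darshan.gz" to "<name>.pdf".
summary_pdf_name() {
    base=${1##*/}                          # drop any leading directory
    printf '%s.pdf\n' "${base%.darshan.gz}"
}

# Hypothetical helper: run the summary tool on each log given as an argument,
# stopping at the first failure.
# Assumption: --output accepts the output file name as a separate argument.
summarize_all() {
    for log in "$@"; do
        darshan-job-summary.pl --output "$(summary_pdf_name "$log")" "$log" || return 1
    done
}

# Example call:
#   summarize_all carns_my-app_id114525_7-27-58921_19.darshan.gz
```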

=== darshan-parser

To obtain a full, human-readable dump of all information
contained in a log file, you can use the `darshan-parser` command
line utility.  It does not require any additional command line tools.
The following example converts the contents of the log file
into a fully expanded text file:

----
darshan-parser carns_my-app_id114525_7-27-58921_19.darshan.gz > ~/job-characterization.txt
----

The format of this output is described in the following section.

=== Guide to darshan-parser output

The beginning of the output from darshan-parser displays a summary of
overall information about the job. The following table defines the meaning
of each line:

[cols="25%,75%",options="header"]
|====
|output line | description
| "# darshan log version" | internal version number of the Darshan log file
| "# size of file statistics" | uncompressed size of each file record in the binary log file
| "# size of job statistics" |  uncompressed size of the overall job statistics in the binary log file
| "# exe" | name of the executable that generated the log file
| "# uid" | user id that the job ran as
| "# jobid" | job id from the scheduler
| "# start_time" | start time of the job, in seconds since the epoch
| "# start_time_asci" | start time of the job, in human-readable format
| "# end_time" | end time of the job, in seconds since the epoch
| "# end_time_asci" | end time of the job, in human-readable format
| "# nprocs" | number of MPI processes
| "# run time" | run time of the job in seconds
|====
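
The header lines in the table above can be picked out of the text dump with standard tools. The following sketch assumes that each header line has the form `# <name>: <value>`; the colon separator is an assumption, as the table only lists the line prefixes. The function name `header_field` is hypothetical.

```shell
# Hypothetical helper: read darshan-parser output on stdin and print the value
# of one header field.
#   $1 = field name without the leading "# ", for example "nprocs" or "run time"
# Assumption: header lines look like "# <name>: <value>".
header_field() {
    awk -v name="$1" '
        BEGIN { prefix = "# " name ": " }
        # Match only lines that start with the exact prefix, so that
        # "start_time" does not also match "start_time_asci".
        index($0, prefix) == 1 { print substr($0, length(prefix) + 1); exit }
    '
}

# Example call:
#   darshan-parser carns_my-app_id114525_7-27-58921_19.darshan.gz | header_field nprocs
```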

TODO: pick up here.