Commit f6fbcec4 authored by Shane Snyder

Merge remote-tracking branch 'origin/master' into dev-modular

Conflicts:
	ChangeLog
	darshan-runtime/configure
	darshan-runtime/configure.in
	darshan-runtime/darshan.h
	darshan-runtime/lib/darshan-posix.c
	darshan-util/configure
	darshan-util/configure.in
	darshan-util/doc/darshan-util.txt
parents 9274a0db 83f12d5f
@@ -27,6 +27,16 @@ Darshan-3.0.0-pre1
darshan-util components are mostly the same and still located in
their respective directories ('darshan-runtime/doc' and 'darshan-util/doc')
darshan-2.3.2-pre1
=============
* Fix gnuplot version number check to allow darshan-job-summary.pl to work
with gnuplot 5.0 (Kay Thust)
* Fix function pointer mapping typo in lio_listio64 wrapper (Shane Snyder)
* Fix faulty logic in extracting I/O data from the aio_return
wrapper (Shane Snyder)
* Fix bug in common access counter logic (Shane Snyder)
* Expand and clarify darshan-parser documentation (Huong Luu)
darshan-2.3.1
=============
* added documentation and example configuration files for using the -profile
......
The Darshan source tree is divided into two parts:
Darshan is a lightweight I/O characterization tool that transparently
captures I/O access pattern information from HPC applications.
Darshan can be used to tune applications for increased scientific
productivity or to gain insight into trends in large-scale computing
systems.
Please see the
[Darshan web page](http://www.mcs.anl.gov/research/projects/darshan)
for more in-depth news and documentation.
The Darshan source tree is divided into two main parts:
- darshan-runtime: to be installed on systems where you intend to
instrument MPI applications. See darshan-runtime/doc/darshan-runtime.txt
@@ -8,9 +18,7 @@ The Darshan source tree is divided into two parts:
log files produced by darshan-runtime. See
darshan-util/doc/darshan-util.txt for installation instructions.
General documentation can be found on the Darshan documentation web page:
http://www.mcs.anl.gov/darshan/documentation/
The darshan-test directory contains various test harnesses, benchmarks,
patches, and unsupported utilities that are mainly of interest to Darshan
developers.
#!/usr/bin/perl -w
# This script will go through all of the darshan logs in a given
# subdirectory and summarize, for each job, how many files were accessed
# with collective MPI-IO, independent MPI-IO, or POSIX, producing one
# tab-delimited line of output per log:
#<jobid> <#files_using_collectives> <#files_using_indep> <#files_using_posix>
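#
# Example invocation (the log directory path and output file name below are
# purely illustrative):
#
#   ./darshan-gather-stats.pl /path/to/darshan/logs > per-job-file-counts.txt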
use strict;
use File::Find;
sub wanted
{
    my $file = $_;
    my $line;
    my $version = 0.0;
    my $nprocs = 0;
    my $start = 0;
    my $end = 0;
    my $start_a = "";
    my $end_a = "";
    my $jobid = 0;
    my $bytes_r = 0;
    my $bytes_w = 0;
    my $perf = 0.0;
    my @fields;
    my $mpi_coll_count = 0;
    my $mpi_indep_count = 0;
    my $posix_count = 0;

    # only operate on darshan log files
    $file =~ /\.darshan\.gz$/ or return;

    # grab jobid from name, old logs don't store it in the file
    if($file =~ /_id(\d+)_/) {
        $jobid = $1;
    }

    if(!(open(SUMMARY, "darshan-parser --file-list-detailed $file |")))
    {
        print(STDERR "Failed to parse $File::Find::name\n");
        return;
    }

    while ($line = <SUMMARY>) {
        # skip comment lines and lines that begin with whitespace
        if($line =~ /^#/) {
            next;
        }
        if($line =~ /^\s/) {
            next;
        }

        @fields = split(/\s/, $line);

        # only process records with the expected 35 fields; classify each file
        # by its access method, preferring collective MPI-IO, then independent
        # MPI-IO, then POSIX
        if($#fields == 34)
        {
            if($fields[13] > 0){
                $mpi_coll_count ++;
            }
            elsif($fields[12] > 0){
                $mpi_indep_count ++;
            }
            elsif($fields[14] > 0){
                $posix_count ++;
            }
        }
    }

    # one output line per log file: jobid plus per-category file counts
    print(STDOUT "$jobid\t$mpi_coll_count\t$mpi_indep_count\t$posix_count\n");
    close(SUMMARY);
}
sub main
{
    my @paths;

    if($#ARGV < 0) {
        die("usage: darshan-gather-stats.pl <one or more log directories>\n");
    }

    @paths = @ARGV;

    print("# <jobid>\t<#files_using_collectives>\t<#files_using_indep>\t<#files_using_posix>\n");
    print("# NOTE: a given file will only show up in one category, with preference in the order shown above (i.e. a file that used collective I/O will not show up in the indep or posix category).\n");

    find(\&wanted, @paths);
}

main();
# Local variables:
# c-indent-level: 4
# c-basic-offset: 4
# End:
#
# vim: ts=8 sts=4 sw=4 expandtab
@@ -131,11 +131,10 @@ specified file.
=== darshan-parser
In order to obtained a full, human readable dump of all information
contained in a log file, you can use the `darshan-parser` command
line utility. It does not require any additional command line tools.
The following example essentially converts the contents of the log file
into a fully expanded text file:
You can use the `darshan-parser` command line utility to obtain a
complete, human-readable, text-format dump of all information contained
in a log file. The following example converts the contents of the
log file into a fully expanded text file:
----
darshan-parser carns_my-app_id114525_7-27-58921_19.darshan.gz > ~/job-characterization.txt
@@ -146,8 +145,14 @@ The format of this output is described in the following section.
=== Guide to darshan-parser output
The beginning of the output from darshan-parser displays a summary of
overall information about the job. The following table defines the meaning
of each line:
overall information about the job. Additional job-level summary information
can also be produced using the `--perf`, `--file`, `--file-list`, or
`--file-list-detailed` command line options. See the
<<addsummary,Additional summary output>> section for more information about
those options.
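For reference, each of these options is passed to `darshan-parser` along with
the log file name, as in the following sketch (reusing the illustrative log
file name from the previous section):

----
darshan-parser --perf carns_my-app_id114525_7-27-58921_19.darshan.gz
darshan-parser --file carns_my-app_id114525_7-27-58921_19.darshan.gz
darshan-parser --file-list carns_my-app_id114525_7-27-58921_19.darshan.gz
darshan-parser --file-list-detailed carns_my-app_id114525_7-27-58921_19.darshan.gz
----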
The following table defines the meaning
of each line in the default header section of the output:
[cols="25%,75%",options="header"]
|====
@@ -365,6 +370,7 @@ value of 1 MiB for optimal file alignment.
|====
==== Additional summary output
[[addsummary]]
The following sections describe additional parser options that provide
summary I/O characterization data for the given log.
@@ -373,8 +379,7 @@ summary I/O characterization data for the given log.
===== Performance
Use the '--perf' option to get performance approximations using four
different computations.
Job performance information can be generated using the `--perf` command-line option.
.Example output
----
@@ -407,6 +412,54 @@ different computations.
# agg_perf_by_slowest: 2206.983935
----
The `total_bytes` line shows the total number of bytes transferred
(read/written) by the job. That is followed by three sections:
.I/O timing for unique files
This section reports information about any files that were *not* opened
by every rank in the job. This includes independent files (opened by
1 process) and partially shared files (opened by a proper subset of
the job's processes). The I/O time for this category of file access
is reported based on the *slowest* rank of all processes that performed this
type of file access.
* unique files: slowest_rank_io_time: total I/O time for unique files
(including both metadata + data transfer time)
* unique files: slowest_rank_meta_time: metadata time for unique files
* unique files: slowest_rank: the rank of the slowest process
.I/O timing for shared files
This section reports information about files that were globally shared (i.e.
opened by every rank in the job). This section estimates performance for
globally shared files using four different methods, listed below (see also the
example command following the list). The `time_by_slowest` method is generally
the most accurate, but it may not be available in some older Darshan log files.
* shared files: time_by_cumul_*: adds the cumulative time across all
processes and divides by the number of processes (inaccurate when there is
high variance among processes).
** shared files: time_by_cumul_io_only: includes metadata AND data transfer
time for global shared files
** shared files: time_by_cumul_meta_only: metadata time for global shared
files
* shared files: time_by_open: difference between timestamp of open and
close (inaccurate if file is left open without I/O activity)
* shared files: time_by_open_lastio: difference between timestamp of open
and the timestamp of last I/O (similar to above but fixes case where file is
left open after I/O is complete)
* shared files: time_by_slowest: measures time according to which rank was
the slowest to perform both metadata operations and data transfer for each
shared file. (most accurate but requires newer log version)
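For scripting or quick inspection, these timings can be pulled out of the
`--perf` output with standard text tools. For example (log file name reused
from the earlier example; the field names are those shown in the example
output above):

----
darshan-parser --perf carns_my-app_id114525_7-27-58921_19.darshan.gz | \
    grep -E "slowest_rank_io_time|time_by_slowest|agg_perf_by_slowest"
----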
.Aggregate performance
Performance is calculated by dividing the total bytes by the I/O time
(shared files and unique files combined) computed using each of the four
methods described in the previous output section. Note that total bytes are
reported in bytes, while aggregate performance is reported in MiB/s
(1024*1024 bytes/s).
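As an illustration of this calculation (a sketch of the computation described
above, not a transcription of the darshan-parser source), the slowest-based
aggregate estimate corresponds to:

----
agg_perf_by_slowest (MiB/s) =
    (total_bytes / (1024*1024)) /
    ((unique files: slowest_rank_io_time) + (shared files: time_by_slowest))
----

The other aggregate estimates combine the unique-file time with the other
shared-file time estimates in the same way.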
===== Files
Use the `--file` option to get totals based on file usage.
The first column is the count of files for that type, the second column is
@@ -416,9 +469,14 @@ accessed.
* total: All files
* read_only: Files that were only read from
* write_only: Files that were only written to
* read_write: Files that were both read and written
* unique: Files that were opened on only one rank
* shared: Files that were opened by more than one rank
Each line has 3 columns. The first column is the count of files for that
type of file, the second column is the number of bytes for that type, and the
third column is the maximum offset accessed.
.Example output
----
# files
......