Darshan Release Change Log

* bug fix for linker failures caused when linking external libraries
  that use MPI internally (e.g., PETSc). Reported by Bilel Hadri.
* bug fix in mapping of Darshan's MPI_File_read_all wrapper to the
  underlying MPI library call in dynamic linking case

* modify Darshan MPI instrumentation method to intercept both MPI and
  PMPI symbols to work around MPI implementations that are calling
  PMPI routines directly (e.g., the Fortran bindings of OpenMPI2).
  Contributed in part by Chris Zimmer.
* add a new Python utility for analyzing DXT trace files (dxt_analyzer).
  Contributed by Alex Sim.
* bug fix to disable Darshan module instrumentation after the
  Darshan shutdown procedure has begun, an issue that was
  leading to negative timers in some cases
* bug fix for autoconf ignoring specified libbz2 location. Contributed
  by Glenn Lockwood.
* add regression test harnesses for Cray systems at ALCF & NERSC
* add support for DARSHAN_EXCLUDE_DIRS environment variable to
  explicitly disable instrumentation for files in given
  directories. Contributed by Cristian Simarro.
* correct "undefined reference to `__wrap_H5get_libversion'" linker failure
  when compiling some HDF5 programs, reported by Jialin Liu
* bug fix in darshan-merge utility related to logs containing DXT trace
  data. Reported by Glenn Lockwood.
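
As a rough illustration of the DARSHAN_EXCLUDE_DIRS entry above, the Python sketch below shows how a directory list taken from the environment could act as a path filter. The comma separator, the helper names, and the plain prefix test are assumptions made for illustration only; Darshan's runtime is written in C and may differ in detail.

```python
import os

def parse_exclude_dirs(env=None):
    """Split a DARSHAN_EXCLUDE_DIRS-style value into a list of directories.

    Assumption: the value is a comma-separated list of directory prefixes.
    """
    env = os.environ if env is None else env
    raw = env.get("DARSHAN_EXCLUDE_DIRS", "")
    return [d.strip() for d in raw.split(",") if d.strip()]

def is_excluded(path, exclude_dirs):
    """Return True if path falls under any excluded directory prefix."""
    return any(path.startswith(d) for d in exclude_dirs)

# Hypothetical directories, passed in as a dict instead of the real environment.
dirs = parse_exclude_dirs({"DARSHAN_EXCLUDE_DIRS": "/scratch/tmp,/opt/app/cache"})
print(is_excluded("/scratch/tmp/run1/out.dat", dirs))
print(is_excluded("/home/user/out.dat", dirs))
```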

* bug fix to prevent darshan-parser segfault when parsing logs with
  DXT module data present. Reported by James Dickson.
* bug fixes to make the darshan-diff utility functional again.

* disable instrumentation for mmap when dynamically linking; this avoids a
  potential deadlock condition on Cray systems using dynamically linked
  executables.  Reported by Cristian Simarro for dynamic linking case.

* add new DxT instrumentation modules to provide fine-grained read/write
  operation tracing at both the POSIX and MPI-IO layers
    - this functionality should be enabled at runtime by exporting the
      DXT_ENABLE_IO_TRACE environment variable
    - trace output is stored within Darshan's traditional log file format
    - a corresponding trace parser (darshan-dxt-parser) is offered within
      darshan-util to allow the DxT trace modules to be parsed and displayed
    - this software was contributed by Cong Xu and Intel's HPDD division.
* add logic to allow Darshan to capture command line arguments from Fortran
  applications (contributed by Cristian Simarro)
* skip instrumentation attempts for anonymous mmap() calls; this avoids a
  potential deadlock condition when used with hugepages on Cray systems.
  Reported by Glenn Lockwood for static linking case.
* fix segmentation fault in statistics collection for applications that issue
  operations with a large number of distinct access sizes or strides on the
  same file.  Reported by Glenn Lockwood.
* disable HDF5 module by default unless enabled using --enable-HDF5-post-1.10
  or --enable-HDF5-pre-1.10 configure arguments.  These options
  vary the wrapper prototypes to match the corresponding HDF5 library ABI.
  The initial patch for HDF5 1.10 compatibility was contributed by 
  Karl-Ulrich Bamberg.
* modified Darshan's path exclusion logic to include a whitelist to prevent
  I/O to/from Cray's Datawarp service from being filtered out (it is located
  within the /var directory, which historically has been excluded). Reported
  by Glenn Lockwood.
* bug fix in resolving underlying call to fopen64 when using the LD_PRELOAD
  instrumentation mechanism
* bug fix to make sure Lustre module headers are installed when installing
  - reported by Matthieu Dorier
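
The whitelist change to the path exclusion logic noted above can be pictured with the following Python sketch, in which a whitelisted prefix takes precedence over an excluded parent directory. The prefix values, including the Datawarp-style path, and the function name are hypothetical placeholders, not Darshan's actual lists.

```python
# Hypothetical prefixes, chosen only to illustrate the precedence rule.
EXCLUDED_PREFIXES = ("/etc/", "/dev/", "/usr/", "/proc/", "/var/")
WHITELISTED_PREFIXES = ("/var/opt/cray/dws/",)

def should_instrument(path):
    """Return True if I/O on this path would be instrumented."""
    # The whitelist is checked first, so it overrides an excluded parent.
    if path.startswith(WHITELISTED_PREFIXES):
        return True
    return not path.startswith(EXCLUDED_PREFIXES)

print(should_instrument("/var/opt/cray/dws/mounts/job1/data"))
print(should_instrument("/var/log/messages"))
```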

* bug fix to MPIIO_F_WRITE_START_TIMESTAMP, which may have produced incorrect
  timestamps in some cases
  - reported by Wucherl (William) Yoo

* add stdio I/O library instrumentation module (Philip Carns)
    - this handles instrumentation of file stream I/O functions
      like fopen(), fprintf(), fscanf(), etc.
    - this module also captures stats on the standard streams (stdin,
      stdout, & stderr)
* add Lustre instrumentation module (Glenn Lockwood)
    - this module provides Lustre striping details (e.g., stripe
      width, stripe size, list of OSTs a file is striped over)
* add new mmap-based logging mechanism that allows Darshan to
  generate output logs even in cases where applications don't
  call MPI_Finalize()
    - these logs are uncompressed and are per-process rather
      than per-job
* add the darshan-merge utility to darshan-util to allow per-process
  logs generated by the mmap-based logging mechanism to be converted
  into Darshan's traditional compressed per-job log files
* augment the POSIX module timestamp counters to also include
  LAST_OPEN & FIRST_CLOSE counters to give more details on application
  I/O intervals
* avoid saving duplicate mount point entries in Darshan log files

* bug fix in darshan logutil mount parsing code that was
  causing file paths to be matched to the first mount point
  with a common prefix rather than the one with the longest
  common prefix
* bug fix in the darshan-util bzip2 configure check that
  was accidentally overwriting Darshan's LDFLAGS
* minor bug fixes to IO start time counters in all modules
  to set IO start time to the actual first start time rather
  than the first IO op to complete
* update darshan-util perl scripts to get perl bin from
  user's path, rather than from /usr/lib (reported by
  Kay Thust)
* update Darshan's fortran and cxx compiler wrapper generators
  to automatically detect MPICH library names on BG/Q
* fix bug that was calculating Darshan's agg_perf_by_slowest
  performance metric incorrectly
* add performance estimate info to darshan-job-summary
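
The mount parsing fix above (longest common prefix rather than first match) can be sketched in Python as follows. The function name and the plain string-prefix comparison are assumptions for illustration; the real logic lives in Darshan's C log utilities.

```python
def match_mount_point(path, mount_points):
    """Return the mount point with the longest prefix match for path, or None."""
    best = None
    for mnt in mount_points:
        # Keep the candidate only if it is longer than the best match so far,
        # instead of stopping at the first mount point that matches.
        if path.startswith(mnt) and (best is None or len(mnt) > len(best)):
            best = mnt
    return best

mounts = ["/", "/home", "/home/projects"]
print(match_mount_point("/home/projects/run/out.dat", mounts))
```

A first-match scan over the same list would stop at "/", which is the kind of mismatch the fix describes.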

* install darshan-null-log-format.h header when installing
  darshan-util component, otherwise compiler errors are
  generated when building external tools that use
* update docs to give debugging tips for cases where
  Darshan logs are not generated
* fix shared library regression test script to check for
  potential errors with Darshan symbols rather than
  failing silently in these cases
* bug fix for determining minimum non-zero counters in
  shared file reductions in all modules
* loosen Darshan's PMPI symbol check to prevent inadvertent
  disabling of Darshan for some MPICH builds
* update runtime docs to give information on upgrading Darshan
* bug fix for resolving MPI_Gather and MPI_Barrier when LD_PRELOADing
  Darshan's shared libraries (reported by Richard Hedges and Rob Latham)
* add more helpful error handling when opening 2.x version log files
* port darshan-diff utility over to new log file format
* fix numerous configure bugs on Cray systems
* add synthetic benchmarking hooks for testing Darshan's shutdown
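
For the minimum non-zero counter fix listed above, the following Python sketch shows one way such a reduction can be expressed: a zero is treated as "no value recorded" and never wins over a real measurement. The function name and sample values are hypothetical; this is not the reduction operator Darshan itself uses.

```python
from functools import reduce

def min_nonzero(a, b):
    """Combine two counters, treating 0 as 'unset' rather than as a minimum."""
    if a == 0:
        return b
    if b == 0:
        return a
    return min(a, b)

# One hypothetical counter value per rank; ranks 0 and 2 recorded nothing.
per_rank_values = [0, 4096, 0, 512]
print(reduce(min_nonzero, per_rank_values, 0))
```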

* add module-specific version fields to header to allow utilities
  to handle different versions of a module's I/O data for backwards
  compatibility -- NOTE: this breaks the log file parsing for logs
  obtained using Darshan-3.0.0-pre2 & Darshan-3.0.0-pre1 
* bug fix in regression test scripts for setting proper environment
  variables to use MPI profiling configuration for Fortran apps
* bug fix in bzip2 log writing implementation in darshan-logutils
* possible race conditions resolved in each module's shutdown code
* general code, comment, and documentation cleanup
* addition of module-specific counter descriptions printed prior
  to parsing a module's I/O data in darshan-parser

* add fix to install appropriate headers for linking external
  applications with darshan-util (reported by Matthieu Dorier)
* add darshan-util Ruby bindings for the new modularized version
  of Darshan (3.0) (Matthieu Dorier)
* add enhancement to darshan-runtime to allow per-module instrumentation
  memory to be user configurable using a configure option or a runtime
  environment variable

* new version of Darshan with the following features/improvements:
    - hooks for developers to add their own instrumentation module
      implementations to capture new I/O characterization data
        - these instrumentation modules can be used to instrument new
          I/O interfaces or gather system-specific parameters, for instance
    - modularized log format allows new module-specific utilities to
      access their I/O characterization data independently
        - this new format also allows new counters to be added to existing
          instrumentation modules without breaking existing utilities
    - Darshan logs now contain a mapping of Darshan's unique record
      identifiers to full file names, instead of fixed-size file name
    - a new instrumentation module for capturing BG/Q-specific parameters
      (BG/Q environment is automatically detected at configure time)
      (implemented by Kevin Harms)
    - new darshan-parser and darshan-job-summary output to utilize the
      new modularized log format
* updated documentation outlining changes in this release, as well as
  steps for adding new instrumentation modules is given in the top-level
  'doc' directory.
    - documentation for configuring and using the darshan-runtime and
      darshan-util components is mostly the same and still located in
      their respective directories ('darshan-runtime/doc' and 'darshan-util/doc')

* Fix gnuplot version number check to allow to work
  with gnuplot 5.0 (Kay Thust)
* Fix function pointer mapping typo in lio_listio64 wrapper (Shane Snyder)
* Fix faulty logic in extracting I/O data from the aio_return 
  wrapper (Shane Snyder)
* Fix bug in common access counter logic (Shane Snyder)
* Expand and clarify darshan-parser documentation (Huong Luu)

* added documentation and example configuration files for using the -profile
  or $MPICC_PROFILE hooks to add instrumentation to MPICH-based MPI
  implementations without generating custom wrapper scripts
* Add wrappers for mkstemp(), mkostemp(), mkstemps(), and mkostemps()
  (reported by Tom Peterka)
* Change OPEN_TIMESTAMP field to report timestamp right before open() is
  invoked rather than after open() has completed.
  NOTE: updated log format version to 2.06 to reflect this change.
* Change start_time and end_time fields in job record to use min and max
  (respectively) across all ranks
* Fix bug in write volume data reported in file system table in (reported by Matthieu Dorier)
* Clean up autoconf test for zlib and make zlib mandatory (reported by Kalyana
* add --start-group and --end-group notation to Darshan libraries for Cray PE
  2.x environment to fix link-time corner cases (Yushu Yao)
* improve y axis labels on time interval graphs in
  (reported by Tom Peterka)
* misc. improvements to darshan-parser --perf output (reported by Shane
  - indicate which rank was slowest in unique file results
  - label I/O vs. meta time more clearly
  - include unique file meta time in agg_perf_by_slowest calculation
* added regression test script framework in darshan-test/regression/
  - currently supported platforms include:
    - Linux with static linking and generated compiler wrappers
    - Linux with static linking and profiler configuration files
    - Linux with dynamic linking and LD_PRELOAD
    - Blue Gene/Q with static linking and profiler configuration files
* update and to support new library
  naming conventions in MPICH 3.1.1 and higher
* update documentation to reflect known issues with some versions of MPICH
* Cray platforms: modify darshan-runtime so that link-time instrumentation 
  options are only used when statically linking via Libs.private.  
  (reported by Kalyana Chadalavada)
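
The change to the job record's start_time and end_time fields described above amounts to a min/max reduction over per-rank timestamps. A minimal Python sketch, assuming each rank contributes one (start, end) pair; the function name and sample values are hypothetical:

```python
def job_time_bounds(rank_times):
    """Return (earliest start, latest end) across all ranks.

    rank_times: list of (start, end) timestamp pairs, one per rank.
    """
    starts = [start for start, _ in rank_times]
    ends = [end for _, end in rank_times]
    return min(starts), max(ends)

# Three hypothetical ranks with slightly skewed clocks and finish times.
print(job_time_bounds([(100, 250), (98, 240), (101, 260)]))
```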

* Fix incorrect version numbering in darshan-runtime component of Darshan
  2.2.9, reported by Jean-Guillaume Piccinali

* Bug fixes:
  - Fix mnt table overflow if a large number of file systems are mounted, 
    reported by David Shrader.
  - Fix argument parsing for darshan-convert, reported by Mouhamed Gueye.
  - Fix metadata annotation overflow in darshan-convert, reported by 
    Mouhamed Gueye.
  - Fix const-correctness in dynamic library when built against MPI 3.x
  - Fix "undefined symbol: dlsym" error when using preloaded dynamic library
    on some platforms, reported by Florin Isaila.
  - Normalize timestamps to always be relative to MPI_Init().
  - Better library name matching in compiler wrappers to handle more MPICH
    variations on Blue Gene systems.
    programs (Shane Snyder).
* Enhancements:
  - Add support (both in documentation and in provided module files) for
    Cray PE 2.x.
  - Honor CC variable to allow darshan-util to be built with other compilers
    besides gcc.
    LD_PRELOAD when instrumenting dynamic libraries, issue reported and
    investigated by Davide Del Vento.
  - Ability to disable shared-file reduction by setting the
  - More thorough output from darshan-parser --perf, suggested by Huong Luu.
  - Increased metadata annotation room from 64 bytes to 1KiB in header.
  - CP_F_{FASTEST/SLOWEST}_RANK_TIME counters now take MPI-IO time into
    account, not just POSIX time, issue reported by Huong Luu.
  - Better handling of systems with many mounted file systems (after which
    point Darshan will assume file resides on / file system), issue reported
    and investigated by David Shrader:
    - Track up to 64 rather than 32 mounted file systems at runtime.
    - Increase header space available for storing mount point information in
      log file from approximately 1 KiB to approximately 3 KiB.
    - Prioritize storing information about non-NFS volumes over NFS volumes
      if too many file systems are mounted to record them all.
  - Added darshan-util pkgconfig file (Shane Snyder).
  - Added --enable-shared configure option to darshan-util to build and
    a shared library version of libdarshan-util
* WARNING: please note that the Darshan module file for Cray environments has
  been updated, especially in the DARSHAN_POST_LINK_OPTS variable.  Please
  update your module file accordingly when upgrading from 2.2.7 or earlier 
  on Cray platforms.
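
The mount table handling described in the enhancements above (a fixed number of tracked file systems, with non-NFS volumes preferred when space runs out) could be sketched in Python like this. The tuple layout, the function name, and the "starts with nfs" type test are assumptions for illustration only.

```python
def select_mounts(mounts, limit=64):
    """Pick at most `limit` mount entries, keeping non-NFS volumes first.

    mounts: list of (mount_point, fs_type) tuples.
    """
    non_nfs = [m for m in mounts if not m[1].startswith("nfs")]
    nfs = [m for m in mounts if m[1].startswith("nfs")]
    # Relative order within each group is preserved; NFS entries only
    # fill whatever room is left after the non-NFS entries.
    return (non_nfs + nfs)[:limit]

mounts = [("/nfs1", "nfs"), ("/lustre", "lustre"), ("/nfs2", "nfs4"), ("/", "ext4")]
print(select_mounts(mounts, limit=2))
```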
* Improved ability to analyze I/O activity related to particular files 
  opened by an application
  - script to generate a separate pdf summary for
    each file opened by an application. Developed by Rob Latham.
  - Added --file-list and --file-list-detailed options to darshan-parser to 
    list files opened by an application along with brief statistics
  - Added --file option to darshan-convert to filter out activity for a
    specific file from a Darshan log
* Add wrappers for POSIX AIO operations (fixes tracking of underlying POSIX
  operations resulting from nonblocking MPI-IO operations.  Bug reported by
  David Shrader.)
* Fix compile-time errors when Darshan is configured for use with MPICH 3.x or
  MPICH 1.5+ installations with optional const support.  Reported by Yushu Yao.
* Fix segmentation fault when using LD_PRELOAD instrumentation on programs
  that use MPI_Init_thread() rather than MPI_Init().  Reported by Myriam

* Updated Cray installation documentation for cleaner integration in Cray
* Fix bug that recorded incorrect device ID (and therefore incorrect mount
  point mapping) if stat() was called before open() on a file
* Store version number of the darshan runtime library in the log file 
  metadata (see lib_ver in darshan-parser output)
* Bug fixes:
  - make sure to honor user-specified hints passed in at runtime
    via the DARSHAN_LOGHINTS env variable.
  - include fread and fwrite in read and write counts
  - fix segmentation fault on invalid arguments to darshan-parser
  - collect mount point information at rank 0 and broadcast to all 
    processes to avoid excess file system traffic on startup
  - change default MPI-IO hints for writing log file to
    romio_no_indep_rw=true and cb_nodes=4 to improve log creation performance
* Install libdarshan-util and headers during installation process for
* Detect PMPI support at link time when using compiler scripts produced
  by the darshan-gen-* utilities.  This avoids link problems when Darshan
  compiler wrappers are used with the ADIOS dummy MPI library; reported
  by Jingqing Mu
* Support MPICH_{CC/CXX/F77} environment variables in compiler scripts
  environment variable
* Rename cp-shutdown-bench test utility to darshan-shutdown-bench and enable
  benchmarking hooks in library by default so that darshan-shutdown-bench 
  can be used with any Darshan installation
* Remove deprecated --enable-st-dev-workaround configure option
* Fix bug in mount point identification when --enable-stat-at-open option is
  not used.  In Darshan 2.2.4, some file entries were recorded as using the "/"
  file system regardless of their location.
* Update patches and documentation for Cray xt-asyncpe environment 5.12 or 
  higher; contributed by Yushu Yao.
  any systems
  compiler.  Reported by Yushu Yao.
  --enable-stat-at-open option is used
* Added --enable-group-readable-logs configure option, which will cause
  Darshan to generate log files with the group read permission bit set.
  This option is useful in conjunction with deployments that set the setgid
  bit on log directories.

* Disable extra stat() of newly opened files by default.  This improves 
  performance on shared files for some platforms.  Reported by Yushu Yao.
* Fix missing -lz in post ld flags reported by Yushu Yao.  Fixes a link-time
  error for some corner-case applications.
* Fix bug in Cray compiler script patches that was setting compiler flags 
  incorrectly.  Reported by Yushu Yao.
  WARNING: if you are using a previous Darshan release (2.2.3 through
  2.2.4-pre4) on a Cray platform, please re-patch your compiler scripts.
* Update darshan-gen-* scripts to support the potential for additional LDFLAGS
  link commands.  This fixes compatibility with some mvapich2 installations,
  reported by Dragos Constantin.

* improved Cray XE6 support
  - support for GNU, PGI, Cray, Pathscale, and Intel compilers
  - patch adding Darshan capability to system compiler scripts
  - software module, including testing and features contributed 
    by Yushu Yao and Katie Antypas of The National Energy Research 
    Scientific Computing Center (NERSC)
  - improved documentation
  - properly detect cxx library name when generating BG/Q compiler wrappers
  - improve hashing to avoid log file name collisions
* bug fixes:
  - remove debugging message that was inadvertently included in
    MPI_File_sync() wrapper
  - fix potential hang if the --with-log-path-by-env argument was used at 
    configure time but the environment variable was not set at run time

* significant improvements to how counters are handled in multi-threaded
* initial (rough) documentation for using Darshan in Cray
  programming environments with static linking
* bug fixes:
  - escape special characters in mount point paths in
    (reported by Mouhamed Gueye)
  - workarounds for various runtime problems with cuserid() and stat() in  
  - build problems with darshan-utils on some versions of OSX
  - accurate shared file statistics for libraries that use deferred opens

* split darshan into separate packages:
  - darshan-runtime: for runtime instrumentation
  - darshan-utils: for processing darshan log files
* changed default output file name for to be based on
  input file name rather than summary.pdf
  allow for easier integration with other instrumentation tools)
* add -cc, -cxx, -f77, -f90, and -fc support to compiler scripts generated by
  the darshan-gen-*.pl scripts
* bug fixes:
  - potential MAX_BYTE overflow on 32 bit systems
  - incorrect pread and pwrite offset tracking
  - corrections to darshan-job-summary variance table
  - better runtime error handling if bzip or gnuplot tools are insufficient
  - improvements to time range in darshan-job-summary graphs
* documentation:
  - improved documentation for both the darshan-runtime and darshan-util
    portions of Darshan can be found in the respective doc/ subdirectory for

* improved error handling when writing log files.  If a write fails then the
  log file will be deleted and a warning will be printed to stderr.

* new darshan-convert command line utility for converting existing log files,
* bzip2 support in command line utilities (but not in the darshan library
* updated log file format that allows for string key/value pairs to be stored
  in the header
  at configure time: --with-log-hints
  at run time: DARSHAN_LOGHINTS environment variable
  symbols in Fortran wrapper script
* performance bug fix: remove unnecessary call to MPI_File_set_size when
  writing log
* added --with-logpath-by-env configure option to allow absolute 
  log path to be specified via environment variable

* additional environment variables to control log, jobid and
  alignment parameters
* bug fixes for darshan-parser --perf calculations
* support for MPI1.x
* support for OpenMPI
* support for PGI, Intel compilers
* added a random identifier to job logs (to avoid collisions from multiple
  application instances within a single scheduler job)
* improved installation and library path management for
* improved error handling in
* additional derived statistics categories for darshan-parser output:
    --all   : all sub-options are enabled
    --base  : darshan log field data [default]
    --perf  : derived perf data
    --total : aggregated darshan field data

* bug fix to variance/minimum calculations on shared files
  darshan-gen-* tools
* new run time environment variable: DARSHAN_INTERNAL_TIMING.  If set at job
  execution time, it will cause Darshan to time its own internal data 
  aggregation routines and print the results to stdout at rank 0.

* new output file format that is portable across architectures
  NOTE: Darshan 1.x output files are incompatible with the tools in this 
  release unless they were generated on a ppc32 architecture (Blue Gene)
  opened each shared file, along with the number of seconds and number of
  bytes consumed by those processes.  It also reports the variance in both
  time and amount of data.
  from different schedulers
* job ID is now recorded within the Darshan log in addition to in the file
  * opens output files directly without using intermediate darshan-parser output
  * table showing data usage per file system
  * table showing I/O variance in shared files
* Fixes for bugs reported by Noah Watkins: 
  * avoid name collision in hashing function
  * divide by zero error in

* fixed erroneous incompatibility warning when opening old logs in darshan-parser

* improved mapping of file records to mount points
* new page in output showing timelines of file access
* checkpoint/restart ability in parallel fsstats script

* Bug fix for lseek, pread, and pwrite when used in 32-bit applications without large file support
* Improved experimental script
* Added new experimental script

* Bug fix for files that are accessed with stat() but never opened
* Workaround zlib problems with 64 bit offsets on 32 bit architectures

* Added "fast" version of each BG/P compiler
* Added experimental scripts in test directory to run fsstats in parallel
* Kevin Harms: Added experimental utilities for loading darshan results into SQL
* Use exclusive flag when opening output file (to protect against file name collision)
* Rob Ross: updates to allow command line tools to build on Darwin
* Bug fix for pnetcdf configure problem reported by Rob Latham; darshan now always pulls in MPI_Wtime() symbol at link time
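
The exclusive-open protection against file name collisions mentioned above can be illustrated with a short Python sketch built on the POSIX O_EXCL flag. The file name, the helper name, and the 0400 mode (borrowed from the default permissions noted elsewhere in this log) are illustrative assumptions; Darshan itself writes its logs from C.

```python
import os
import tempfile

def create_log_exclusive(path):
    """Create a new log file, failing if the name is already taken."""
    # O_EXCL together with O_CREAT makes the open fail if the file exists.
    return os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o400)

with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "example.darshan")  # hypothetical file name
    fd = create_log_exclusive(log_path)
    os.write(fd, b"log data")
    os.close(fd)
    try:
        create_log_exclusive(log_path)
    except FileExistsError:
        print("log file already exists; refusing to overwrite")
```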

* Added tracking of file system type and mount point for each file
* Added tracking of file size at open time (CP_SIZE_AT_OPEN)
* Moved sync cost to be counted in cumulative write time rather than cumulative metadata time
* Added sync as a separate category in counters
* Bug fix to most frequent access size table in
* Converted all utilities to use darshan-logutils api for reading output files
* Added backwards compatibility to darshan-logutils routines
* Kevin Harms: Added darshan-analyzer utility to summarize usage of MPI-IO, pNetCDF, HDF5, and shared files across a set of output files
* Fixed bug in field listing for darshan-diff utility

* Minor fix for a compile warning

* Kevin Harms: bug fix for segfault in apps that use MPI_Init_thread()

* Limit PMPI usage in library to fewer functions
* Update PMPI detection in compiler scripts to ignore functions unused by Darshan

* Track files opened via Parallel NetCDF
* Track files opened via HDF5
* Record slowest individual POSIX read and write times along with access size for those operations
* Inspect symbols at compile time to determine whether to enable Darshan or not based on the presence of MPI and PMPI symbols
* Use GNU and IBM compilers from path rather than hard coded location
* Simplify warning message if unable to open log file
* Remove unused internal benchmark routines

* Updated compiler scripts for V1R4 driver on BlueGene/P

* Added *_r versions of each IBM compiler script on BlueGene/P

* Set default permissions to 0400 (user read only) for output files
* Automatically disable Darshan at link time if common PMPI libraries are detected in the command line
* Experimental tool ( to automatically generate Darshan-enabled mpicc scripts

* Initial public release