darshan issue #317
Issue created Apr 22, 2021 by Shane Snyder (@ssnyder, Owner)

crashes due to interactions between OMPIO, HDF5, and Darshan

Open MPI applications (versions 4.0.5 and 4.1.0 tested) that generate zero-length file views and also use Darshan MPI-IO instrumentation (versions 3.2.0+) will crash in a call to MPI_File_get_byte_offset() made by the Darshan wrappers.

This access pattern can also be triggered by creating zero-length dataspaces in HDF5, which is what prompted the initial bug report.

The bug appears to be in Open MPI's OMPIO driver, and the Open MPI developers are currently investigating. The ROMIO driver works fine.

We will investigate whether we can disable the MPI_File_get_byte_offset() calls for OpenMPI applications to help protect against this.
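One way such a guard could look is sketched below. This is only an illustration of the idea, not Darshan's actual code: it assumes the `OPEN_MPI` macro that Open MPI's `mpi.h` defines, and every `sketch_`-prefixed name is hypothetical. The MPI call is replaced by a stub so the snippet builds without MPI installed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for MPI_Offset so the sketch builds without mpi.h. */
typedef long long sketch_offset_t;

/* Open MPI's mpi.h defines OPEN_MPI; other implementations such as MPICH
 * do not. In a real wrapper, mpi.h would be included before this check. */
#ifdef OPEN_MPI
#define SKETCH_SKIP_BYTE_OFFSET 1
#else
#define SKETCH_SKIP_BYTE_OFFSET 0
#endif

/* Hypothetical stub standing in for MPI_File_get_byte_offset(fh, 0, &disp).
 * It reports a fixed offset and returns 0 for success. */
static int sketch_get_byte_offset(sketch_offset_t *disp)
{
    *disp = 4096;
    return 0;
}

/* Store the starting byte offset in *out and return 1, or store -1 and
 * return 0 when the query is skipped (Open MPI build) or fails. */
int sketch_record_offset(sketch_offset_t *out)
{
    assert(out != NULL);
    *out = -1;

    if (SKETCH_SKIP_BYTE_OFFSET)
        return 0; /* avoid the call that crashes under OMPIO */

    sketch_offset_t disp;
    if (sketch_get_byte_offset(&disp) != 0)
        return 0;

    *out = disp;
    return 1;
}
```

A compile-time check like this would also skip the call when an Open MPI build uses ROMIO, which works fine, so it trades some offset information for safety.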

See the following for more details (and a reproducer that does not rely on Darshan or HDF5 code):

https://forum.hdfgroup.org/t/parallel-hdf5-write-with-irregular-size-in-one-dimension/8284/11

https://github.com/open-mpi/ompi/issues/8841

Edited Apr 22, 2021 by Shane Snyder