User-defined data block within Darshan logs
I would like the ability to insert new records/modules (or even arbitrary data, like HDF5's user block) into Darshan logs after a log has been generated. While this would probably be dangerous to do in production, it would be a more portable way for downstream analysis to attach indices or derived quantities to existing logs so they don't have to be recalculated.
A specific use case that I've had is generating the results of `darshan-parser --perf` only once per log and then caching that information somewhere. I resorted to storing these summary metrics as extended attributes associated with the Darshan log file, but xattrs tend to disappear when a file is transferred across different systems or batched up into tar-like formats.
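For reference, the xattr workaround looks roughly like the sketch below. This is a minimal illustration and not my exact script: the attribute name `user.darshan.perf` and the helper `perf_summary` are made up for this example, and it assumes Linux (where Python exposes `os.getxattr`/`os.setxattr`) with `darshan-parser` on the `PATH`.

```python
import os
import subprocess

# Hypothetical attribute name chosen for this sketch; not a Darshan convention.
XATTR_NAME = "user.darshan.perf"


def perf_summary(log_path, parser_cmd=("darshan-parser", "--perf")):
    """Return the parser output for log_path, caching it in an xattr if possible."""
    try:
        # Cache hit: the summary was already attached to the log file.
        return os.getxattr(log_path, XATTR_NAME).decode()
    except OSError:
        pass  # no cached value yet, or xattrs unsupported on this filesystem

    # Cache miss: run the parser once and capture its text output.
    output = subprocess.run(
        [*parser_cmd, log_path], check=True, capture_output=True, text=True
    ).stdout

    try:
        os.setxattr(log_path, XATTR_NAME, output.encode())
    except OSError:
        pass  # value too large or xattrs unsupported; next call recomputes
    return output
```

The cached value lives in filesystem metadata rather than in the log itself, which is why it is lost when the log is copied to another system or archived without xattr support. A user-defined block inside the log would travel with the file.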
Another potential use case would be to plug into a framework like TOKIO and insert additional performance data that came from sources outside the scope of the job. A facility could add value to users' Darshan logs by putting server-side I/O load data into the Darshan log after the job has completed, giving the user a single über-log that contains everything the center knows about I/O that is relevant to that job.