DarshanRecordCollections were introduced to allow memory optimizations while maintaining a stable public API for exporting data to common formats such as pandas, NumPy, or JSON.
In particular, using DarshanRecordCollection helps to:
- Stabilize public APIs while allowing various optimizations internally
- Simplify plotting/summarizing for common reporting via the export functions, in particular to_df()
- Consolidate export conversion logic into fewer places (e.g., to_json, to_df)
- Allow pretty display representation in Jupyter notebooks
- Reduce metadata redundancies and thus conserve memory (e.g., record collection across shared rank, id/nrec, etc.)
- Allow changing and mixing internal representations as most suitable for record type
- Allow various record indexing strategies to be introduced later on
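
The idea behind the list above can be illustrated with a minimal, standard-library-only sketch. It is not the pydarshan implementation: the class `SketchRecordCollection`, its constructor arguments, and the `to_dict()` helper are hypothetical names chosen for illustration, and the counter values are made up. The sketch shows shared metadata (module name, counter names) stored once per collection, a compact internal row format that could be swapped out later, and export methods that stay the same regardless of that internal format. A pandas-based `to_df()` is left out to keep the example dependency-free.

```python
import json


class SketchRecordCollection:
    """Hypothetical stand-in for illustration; not the pydarshan class."""

    def __init__(self, mod, counter_names):
        # Shared metadata is stored once per collection, not once per record.
        self.mod = mod
        self.counter_names = list(counter_names)
        # Internal representation: compact tuples. This detail could change
        # (e.g., to arrays) without affecting the export methods below.
        self._rows = []

    def append(self, rank, rec_id, counters):
        if len(counters) != len(self.counter_names):
            raise ValueError("counter count does not match counter_names")
        self._rows.append((rank, rec_id, tuple(counters)))

    def __len__(self):
        return len(self._rows)

    def to_dict(self):
        # Single place where rows are expanded into self-describing records.
        return [
            {
                "rank": rank,
                "id": rec_id,
                "counters": dict(zip(self.counter_names, counters)),
            }
            for rank, rec_id, counters in self._rows
        ]

    def to_json(self):
        # Reuses to_dict() so conversion logic is not duplicated.
        return json.dumps({"mod": self.mod, "records": self.to_dict()})

    def _repr_html_(self):
        # Hook that Jupyter/IPython calls for rich display of an object.
        return f"<b>{self.mod}</b>: {len(self)} records"


# Made-up example data, for illustration only.
collection = SketchRecordCollection("POSIX", ["POSIX_OPENS", "POSIX_READS"])
collection.append(rank=0, rec_id=101, counters=[1, 4])
collection.append(rank=1, rec_id=102, counters=[2, 8])
print(collection.to_json())
```

Because callers only touch `to_dict()`, `to_json()`, and the display hook, the internal `_rows` layout can be replaced by a more memory-efficient structure without breaking code that consumes the exports.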