PyDarshan: Record Collections (!86)

Status: Merged. Jakob Luettgau requested to merge jluettgau/darshan:pydarshan-RecordCollection into master on Mar 15, 2021.
  • Overview 2
  • Commits 33
  • Changes 18

DarshanRecordCollections are introduced to allow memory optimizations while maintaining a stable public API for data export to common formats such as pandas, NumPy, or JSON.

In particular, using DarshanRecordCollection helps to:

  • Stabilize public APIs while allowing various optimizations internally
  • Simplify plotting/summarizing for common reporting via the export functions, in particular to_df()
  • Consolidate export conversion logic into fewer places (e.g., to_json, to_df)
  • Allow a pretty display representation in Jupyter notebooks
  • Reduce metadata redundancies and thus conserve memory (e.g., record collection across shared rank, id/nrec, etc.)
  • Allow changing and mixing internal representations as most suitable for record type
  • Allow various record indexing strategies to be introduced later on
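
The goals listed above can be illustrated with a minimal sketch. This is not the code from this merge request: `SketchRecordCollection`, its constructor arguments, and the `to_dict` method are hypothetical names chosen for illustration, and the counter values are made up. The sketch only shows the general pattern of storing shared metadata once, keeping the internal representation private, and exposing export methods as the stable surface. A `to_df()` export would presumably follow the same pattern on top of pandas.

```python
import json
from html import escape


class SketchRecordCollection:
    """Hypothetical stand-in for a record collection (not PyDarshan's class)."""

    def __init__(self, mod, counter_names):
        # Metadata shared by all records is stored once, not per record.
        self.mod = mod
        self.counter_names = list(counter_names)
        # Internal representation (assumption: plain tuples). Callers never
        # touch it directly, so it can be swapped out later.
        self._rows = []

    def append(self, rank, rec_id, counters):
        if len(counters) != len(self.counter_names):
            raise ValueError("counter count does not match counter_names")
        self._rows.append((rank, rec_id, tuple(counters)))

    def __len__(self):
        return len(self._rows)

    def to_dict(self):
        """Export as a list of plain dicts, one per record."""
        return [
            {"rank": rank, "id": rec_id, **dict(zip(self.counter_names, counters))}
            for rank, rec_id, counters in self._rows
        ]

    def to_json(self):
        """Export as a JSON string; conversion logic lives in one place."""
        return json.dumps({"module": self.mod, "records": self.to_dict()})

    def _repr_html_(self):
        # Jupyter looks for _repr_html_ when rendering an object as rich output.
        names = ["rank", "id", *self.counter_names]
        header = "".join(f"<th>{escape(str(n))}</th>" for n in names)
        body = "".join(
            "<tr>"
            + "".join(f"<td>{escape(str(v))}</td>" for v in (rank, rec_id, *counters))
            + "</tr>"
            for rank, rec_id, counters in self._rows
        )
        return f"<table><tr>{header}</tr>{body}</table>"


# Illustrative values only.
coll = SketchRecordCollection("POSIX", ["POSIX_OPENS", "POSIX_READS"])
coll.append(rank=0, rec_id=1234, counters=[1, 8])
coll.append(rank=1, rec_id=1234, counters=[1, 4])
print(coll.to_json())
```

Because callers only use the export methods, the tuple storage in `_rows` could later be replaced by, for example, a NumPy array or an indexed structure without changing calling code.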
Source branch: pydarshan-RecordCollection