ENH: tracking issue for py-darshan and CLI modernization
Brief description of deliverable:
Re-architect Darshan analysis tools using Python to be both more user-friendly and more amenable to learning methods (LANL).
More detailed tasks:
- `darshan-job-summary.pl` -> develop a command-line utility in Python to replace this, leveraging e.g. `py-darshan` under the hood
  - we will definitely want to retain some kind of basic PDF output because that has worked well over the years; however, the format of the PDF can change somewhat: the team has a pretty good idea of which parts of the PDF should be retained and which can be removed/improved
  - for the most part, the expectation is that users will run the command-line utility "locally," so we should not feel bound to the limited Python installations/deps available on typical/restrictive HPC platforms
  - the Perl infrastructure is apparently fairly slow: it uses regex on plain-text output, while the Python implementation should be able to use the C bindings to probe the binary data directly
  - the Perl script doesn't really have substantial command-line arguments apart from provision of the log file, so we don't have to worry too much about preserving old behavior for command-line interaction
  - currently `pdflatex` and `gnuplot` are involved in the PDF generation, but we are not bound to that at all
  - we noted that disabling "shared records" via the runtime environment variable was generally most useful for dissecting complicated problems that require per-rank dissection of activity; having the CLI able to handle both types of data granularity would be useful (perhaps selectively disabling certain outputs based on the log data format)
  - there is also interest in supporting the extra tracing data available from `DXT`; `DXT` traces individual I/O operations while `darshan` mostly provides application-level stats; `DXT` becomes important when the plain `darshan` results don't suffice for diagnosing the problem (but there is an overhead/data size cost involved)
  - are there tests for the Perl CLI?
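To make the CLI points above concrete, here is a rough sketch of what a minimal entry point could look like: one positional log file argument, plus a helper that switches report sections on or off depending on what the log contains. Every name in it (the `darshan-summary` program name, the `--output` flag, the section names, and the two boolean inputs) is a hypothetical placeholder, not an existing interface; a real tool would derive those booleans by inspecting the log through `py-darshan`.

```python
import argparse


def build_parser():
    """Build the argument parser; the program name and flags are hypothetical."""
    parser = argparse.ArgumentParser(
        prog="darshan-summary",  # placeholder name, not an existing tool
        description="Summarize a Darshan log (illustrative sketch).",
    )
    # The Perl script essentially only takes the log file, so keep that shape.
    parser.add_argument("log_path", help="path to the Darshan log file")
    parser.add_argument(
        "--output", default="summary.pdf", help="where to write the report"
    )
    return parser


def select_sections(has_per_rank_records, has_dxt):
    """Pick the report sections that the log's contents can support.

    Section names are made up for illustration.
    """
    sections = ["job_overview", "per_module_totals"]
    if has_per_rank_records:
        # Only meaningful when shared-record reduction was disabled at runtime.
        sections.append("per_rank_activity")
    if has_dxt:
        # Only meaningful when DXT tracing data is present in the log.
        sections.append("dxt_timeline")
    return sections


def main(argv=None):
    args = build_parser().parse_args(argv)
    # Stand-in values: a real tool would inspect the log here instead.
    sections = select_sections(has_per_rank_records=False, has_dxt=False)
    print(f"{args.log_path} -> {args.output}: {', '.join(sections)}")
    return 0
```

Calling `main(["job.darshan"])` would parse the path and report the sections chosen for a log with neither per-rank records nor DXT data. Keeping the section selection in a plain function, separate from argument parsing and rendering, should also make it easy to test without a PDF backend.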
- `py-darshan` improvements proper
  - we want the data structures provided here to be suitable for e.g. modern machine learning applications; there is still too much post-processing required for end-user applications like this
  - there should be at least some `pytest` testing infrastructure we can build on here
  - the user count is currently low enough that we can likely break the API to some extent if needed
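As an illustration of the kind of post-processing that end users currently have to do themselves, the sketch below flattens per-record counters into rows that all share one column layout, which is the shape most learning tools expect. The record layout and the sample values are assumptions made up for this example, not `py-darshan`'s actual output format; only the counter names follow Darshan's POSIX naming. The test function is written so that `pytest` would collect it as-is.

```python
# Hypothetical record layout for illustration only; this is not
# py-darshan's actual output format, and the values are made up.
SAMPLE_RECORDS = [
    {"rank": 0, "counters": {"POSIX_READS": 4, "POSIX_BYTES_READ": 4096}},
    {"rank": 1, "counters": {"POSIX_WRITES": 2, "POSIX_BYTES_WRITTEN": 2048}},
]


def records_to_rows(records, fill=0):
    """Flatten per-record counter dicts into a header plus equal-width rows.

    Counters missing from a record are filled with `fill`, so every row
    has the same columns in the same (sorted) order.
    """
    names = sorted({name for rec in records for name in rec["counters"]})
    header = ["rank"] + names
    rows = [
        [rec["rank"]] + [rec["counters"].get(name, fill) for name in names]
        for rec in records
    ]
    return header, rows


def test_rows_share_one_schema():
    """pytest-style check that every row matches the header width and order."""
    header, rows = records_to_rows(SAMPLE_RECORDS)
    assert header == [
        "rank",
        "POSIX_BYTES_READ",
        "POSIX_BYTES_WRITTEN",
        "POSIX_READS",
        "POSIX_WRITES",
    ]
    assert rows == [[0, 4096, 0, 4, 0], [1, 0, 2048, 0, 2]]
    assert all(len(row) == len(header) for row in rows)
```

If the library handed back something already shaped like `header` and `rows` (or a DataFrame equivalent), this step would disappear from user code, which is roughly the goal stated above.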