"Report aggregations and summarization remains**experimental** for now, mostly to allow interfaces to stabilize. But experimental features can be switched on easily by invoking `darshan.enable_experimental()`:"
"Report aggregations and summarization remains**experimental** for now, mostly to allow interfaces to stabilize. But experimental features can be switched on easily by invoking `darshan.enable_experimental()`:"
]
},
{
...
...
%% Cell type:markdown id: tags:
# DarshanUtils for Python
This notebook gives an overview of features provided by the Python bindings for DarshanUtils.
%% Cell type:markdown id: tags:
By default all records, metadata, available modules and the name records are loaded when opening a Darshan log:
%% Cell type:code id: tags:
``` python
# report.metadata      # dictionary with raw metadata from the Darshan log
# report.modules       # dictionary with raw module info from the Darshan log (need: technical, module idx)
# report.name_records  # dictionary for resolving name records: id -> path/name
# report.records       # per-module "dataframes"/dictionaries holding the loaded records
```
%% Cell type:markdown id: tags:
The Darshan report holds a variety of namespaces for report-related data. All of them are currently also referenced in `report.data`, but relying on this internal organization of the report object is discouraged once the API has stabilized. Currently, `report.data` references the following information:
%% Cell type:code id: tags:
``` python
# visualization helper used by different examples in the remainder of this notebook
from IPython.display import display, HTML
# usage: display(obj)
```
%% Cell type:markdown id: tags:
### Record Formats and Selectively Loading Records
For memory-efficient analysis, it is possible to suppress the automatic loading of records. This is useful, for example, when an analysis considers only the records of a particular layer/module.
%% Cell type:code id: tags:
``` python
import darshan
report = darshan.DarshanReport("example-logs/example.darshan", read_all=False, lookup_name_records=True)  # Loads no records!
```
%% Cell type:code id: tags:
``` python
# expected to fail, as no records were loaded
try:
    print(len(report.records['STDIO']), "records loaded for STDIO.")
except KeyError:
    # no 'STDIO' entry exists in report.records until that module is loaded
    print("No STDIO records loaded for this report yet.")
```
%% Output
No STDIO records loaded for this report yet.
%% Cell type:markdown id: tags:
Additional records can then be loaded selectively, for example on a per-module basis:
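A minimal sketch of per-module loading, assuming the report object exposes `mod_read_all_records(name)` and a dict-like `modules` attribute, as PyDarshan reports are expected to. The helper name `load_modules` is hypothetical and works with any object that provides those two members:

```python
def load_modules(report, modules):
    """Load records for the requested modules only; return the names actually loaded.

    Assumption: `report` offers `mod_read_all_records(name)` and a dict-like
    `modules` attribute listing the modules present in the log.
    """
    loaded = []
    for name in modules:
        if name not in report.modules:
            # skip modules that are not present in this log
            continue
        report.mod_read_all_records(name)
        loaded.append(name)
    return loaded
```

With the report opened above, `load_modules(report, ["STDIO"])` would request only the STDIO records, after which `report.records['STDIO']` should be available.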
Darshan log data is routinely aggregated for a quick overview. The report object offers a few methods to perform common aggregations:
%% Cell type:markdown id: tags:
Report aggregations and summarization remain **experimental** for now, mostly to allow interfaces to stabilize. Experimental features can be switched on easily by invoking `darshan.enable_experimental()`:
%% Cell type:code id: tags:
``` python
import darshan
darshan.enable_experimental(verbose=True) # Enable verbosity, listing new functionality
```
%% Output
Added method create_time_summary to DarshanReport.
Added method print_module_records to DarshanReport.
Added method summarize to DarshanReport.
Added method merge to DarshanReport.
Added method create_timeline to DarshanReport.
Added method records_as_dict to DarshanReport.
Added method reduce to DarshanReport.
Added method agg_ioops to DarshanReport.
Added method create_sankey to DarshanReport.
Added method filter to DarshanReport.
Added method mod_agg_iohist to DarshanReport.
Added method name_records_summary to DarshanReport.
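%% Cell type:markdown id: tags:
The output above suggests that enabling experimental features attaches additional methods to `DarshanReport` at runtime. How PyDarshan does this internally is not shown in this notebook, so the following is only a generic sketch of the technique in plain Python. The class `Report`, the function `count_records` and the helper `enable_extras` are hypothetical stand-ins:

```python
class Report:
    """Stand-in for a report class; not the PyDarshan implementation."""

    def __init__(self, records):
        self.records = records


def count_records(self):
    """Return the number of records per module."""
    return {mod: len(recs) for mod, recs in self.records.items()}


def enable_extras(cls, functions, verbose=False):
    """Attach plain functions to `cls` as methods and return their names."""
    added = []
    for func in functions:
        setattr(cls, func.__name__, func)
        added.append(func.__name__)
        if verbose:
            print(f"Added method {func.__name__} to {cls.__name__}.")
    return added


enable_extras(Report, [count_records], verbose=True)
report_demo = Report({"POSIX": [1, 2, 3], "STDIO": [4]})
print(report_demo.count_records())
```

Attaching methods this way keeps the core class small while still letting optional functionality be called as `report.some_method()`.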
%% Cell type:code id: tags:
``` python
# Example report, which counts records in log across modules
print("IOOPS have not been aggregated for this report.")
```
%% Output
IOOPS have not been aggregated for this report.
%% Cell type:code id: tags:
``` python
report.read_all()
report.summarize()
```
%% Cell type:code id: tags:
``` python
report.summary['agg_ioops']
```
%% Output
{'MPI-IO': {'MPIIO_INDEP_OPENS': 0,
'MPIIO_COLL_OPENS': 2048,
'MPIIO_INDEP_READS': 0,
'MPIIO_INDEP_WRITES': 18,
'MPIIO_COLL_READS': 0,
'MPIIO_COLL_WRITES': 16384,
'MPIIO_SPLIT_READS': 0,
'MPIIO_SPLIT_WRITES': 0,
'MPIIO_NB_READS': 0,
'MPIIO_NB_WRITES': 0,
'MPIIO_SYNCS': 0,
'MPIIO_HINTS': 0,
'MPIIO_VIEWS': 32768,
'MPIIO_MODE': 9,
'MPIIO_BYTES_READ': 0,
'MPIIO_BYTES_WRITTEN': 2199023259968,
'MPIIO_RW_SWITCHES': 0,
'MPIIO_MAX_READ_TIME_SIZE': 0,
'MPIIO_MAX_WRITE_TIME_SIZE': 134217728,
'MPIIO_SIZE_READ_AGG_0_100': 0,
'MPIIO_SIZE_READ_AGG_100_1K': 0,
'MPIIO_SIZE_READ_AGG_1K_10K': 0,
'MPIIO_SIZE_READ_AGG_10K_100K': 0,
'MPIIO_SIZE_READ_AGG_100K_1M': 0,
'MPIIO_SIZE_READ_AGG_1M_4M': 0,
'MPIIO_SIZE_READ_AGG_4M_10M': 0,
'MPIIO_SIZE_READ_AGG_10M_100M': 0,
'MPIIO_SIZE_READ_AGG_100M_1G': 0,
'MPIIO_SIZE_READ_AGG_1G_PLUS': 0,
'MPIIO_SIZE_WRITE_AGG_0_100': 4,
'MPIIO_SIZE_WRITE_AGG_100_1K': 14,
'MPIIO_SIZE_WRITE_AGG_1K_10K': 0,
'MPIIO_SIZE_WRITE_AGG_10K_100K': 0,
'MPIIO_SIZE_WRITE_AGG_100K_1M': 0,
'MPIIO_SIZE_WRITE_AGG_1M_4M': 0,
'MPIIO_SIZE_WRITE_AGG_4M_10M': 0,
'MPIIO_SIZE_WRITE_AGG_10M_100M': 0,
'MPIIO_SIZE_WRITE_AGG_100M_1G': 16384,
'MPIIO_SIZE_WRITE_AGG_1G_PLUS': 0,
'MPIIO_ACCESS1_ACCESS': 134217728,
'MPIIO_ACCESS2_ACCESS': 272,
'MPIIO_ACCESS3_ACCESS': 544,
'MPIIO_ACCESS4_ACCESS': 328,
'MPIIO_ACCESS1_COUNT': 16384,
'MPIIO_ACCESS2_COUNT': 8,
'MPIIO_ACCESS3_COUNT': 2,
'MPIIO_ACCESS4_COUNT': 2,
'MPIIO_FASTEST_RANK': 597,
'MPIIO_FASTEST_RANK_BYTES': 1073741824,
'MPIIO_SLOWEST_RANK': 1312,
'MPIIO_SLOWEST_RANK_BYTES': 1073741824},
'MPI-IO_indep_simple': {'Read': 0,
'Write': 18,
'Open': 0,
'Stat': 0,
'Seek': 0,
'Mmap': 0,
'Fsync': 0},
'MPI-IO_coll_simple': {'Read': 0,
'Write': 16384,
'Open': 2048,
'Stat': 0,
'Seek': 0,
'Mmap': 0,
'Fsync': 0},
'POSIX': {'POSIX_OPENS': 2049,
'POSIX_FILENOS': -1,
'POSIX_DUPS': -1,
'POSIX_READS': 0,
'POSIX_WRITES': 16402,
'POSIX_SEEKS': 16404,
'POSIX_STATS': 0,
'POSIX_MMAPS': 0,
'POSIX_FSYNCS': 0,
'POSIX_FDSYNCS': 0,
'POSIX_RENAME_SOURCES': -1,
'POSIX_RENAME_TARGETS': -1,
'POSIX_RENAMED_FROM': 0,
'POSIX_MODE': 0,
'POSIX_BYTES_READ': 0,
'POSIX_BYTES_WRITTEN': 2199023259968,
'POSIX_MAX_BYTE_READ': 0,
'POSIX_MAX_BYTE_WRITTEN': 2199023261831,
'POSIX_CONSEC_READS': 0,
'POSIX_CONSEC_WRITES': 0,
'POSIX_SEQ_READS': 0,
'POSIX_SEQ_WRITES': 16384,
'POSIX_RW_SWITCHES': 0,
'POSIX_MEM_NOT_ALIGNED': 0,
'POSIX_MEM_ALIGNMENT': 8,
'POSIX_FILE_NOT_ALIGNED': 16401,
'POSIX_FILE_ALIGNMENT': 1048576,
'POSIX_MAX_READ_TIME_SIZE': 0,
'POSIX_MAX_WRITE_TIME_SIZE': 134217728,
'POSIX_SIZE_READ_0_100': 0,
'POSIX_SIZE_READ_100_1K': 0,
'POSIX_SIZE_READ_1K_10K': 0,
'POSIX_SIZE_READ_10K_100K': 0,
'POSIX_SIZE_READ_100K_1M': 0,
'POSIX_SIZE_READ_1M_4M': 0,
'POSIX_SIZE_READ_4M_10M': 0,
'POSIX_SIZE_READ_10M_100M': 0,
'POSIX_SIZE_READ_100M_1G': 0,
'POSIX_SIZE_READ_1G_PLUS': 0,
'POSIX_SIZE_WRITE_0_100': 4,
'POSIX_SIZE_WRITE_100_1K': 14,
'POSIX_SIZE_WRITE_1K_10K': 0,
'POSIX_SIZE_WRITE_10K_100K': 0,
'POSIX_SIZE_WRITE_100K_1M': 0,
'POSIX_SIZE_WRITE_1M_4M': 0,
'POSIX_SIZE_WRITE_4M_10M': 0,
'POSIX_SIZE_WRITE_10M_100M': 0,
'POSIX_SIZE_WRITE_100M_1G': 16384,
'POSIX_SIZE_WRITE_1G_PLUS': 0,
'POSIX_STRIDE1_STRIDE': 274743689216,
'POSIX_STRIDE2_STRIDE': 274743691264,
'POSIX_STRIDE3_STRIDE': 0,
'POSIX_STRIDE4_STRIDE': 0,
'POSIX_STRIDE1_COUNT': 10240,
'POSIX_STRIDE2_COUNT': 4096,
'POSIX_STRIDE3_COUNT': 0,
'POSIX_STRIDE4_COUNT': 0,
'POSIX_ACCESS1_ACCESS': 134217728,
'POSIX_ACCESS2_ACCESS': 272,
'POSIX_ACCESS3_ACCESS': 544,
'POSIX_ACCESS4_ACCESS': 328,
'POSIX_ACCESS1_COUNT': 16384,
'POSIX_ACCESS2_COUNT': 8,
'POSIX_ACCESS3_COUNT': 2,
'POSIX_ACCESS4_COUNT': 2,
'POSIX_FASTEST_RANK': 597,
'POSIX_FASTEST_RANK_BYTES': 1073741824,
'POSIX_SLOWEST_RANK': 1312,
'POSIX_SLOWEST_RANK_BYTES': 1073741824},
'POSIX_simple': {'Read': 0,
'Write': 16402,
'Open': 2049,
'Stat': 0,
'Seek': 16404,
'Mmap': 0,
'Fsync': 0},
'STDIO': {'STDIO_OPENS': 129,
'STDIO_FDOPENS': -129,
'STDIO_READS': 0,
'STDIO_WRITES': 74,
'STDIO_SEEKS': 0,
'STDIO_FLUSHES': 0,
'STDIO_BYTES_WRITTEN': 3309,
'STDIO_BYTES_READ': 0,
'STDIO_MAX_BYTE_READ': 0,
'STDIO_MAX_BYTE_WRITTEN': 3307,
'STDIO_FASTEST_RANK': 0,
'STDIO_FASTEST_RANK_BYTES': 0,
'STDIO_SLOWEST_RANK': 0,
'STDIO_SLOWEST_RANK_BYTES': 0},
'STDIO_simple': {'Read': 0,
'Write': 74,
'Open': 129,
'Stat': 0,
'Seek': 0,
'Mmap': 0,
'Fsync': 0}}
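%% Cell type:markdown id: tags:
The `*_simple` entries in the summary condense each module into a handful of operation classes. Below is a short sketch of pulling only those entries out of a dictionary shaped like the output above. `simple_views` is a hypothetical helper, and the literal copies just a few values from that output:

```python
def simple_views(agg_ioops):
    """Return only the condensed '<module>_simple' entries, keyed by module label."""
    suffix = "_simple"
    return {
        key[: -len(suffix)]: counts
        for key, counts in agg_ioops.items()
        if key.endswith(suffix)
    }


# excerpt of the summary shown above (most counters omitted for brevity)
agg_excerpt = {
    "POSIX": {"POSIX_OPENS": 2049, "POSIX_WRITES": 16402},
    "POSIX_simple": {"Read": 0, "Write": 16402, "Open": 2049, "Stat": 0,
                     "Seek": 16404, "Mmap": 0, "Fsync": 0},
    "STDIO": {"STDIO_OPENS": 129, "STDIO_WRITES": 74},
    "STDIO_simple": {"Read": 0, "Write": 74, "Open": 129, "Stat": 0,
                     "Seek": 0, "Mmap": 0, "Fsync": 0},
}

for module, counts in simple_views(agg_excerpt).items():
    # show only the operation classes that actually occurred
    used = {op: n for op, n in counts.items() if n > 0}
    print(module, used)
```

With a fully loaded and summarized report, the same helper could be applied to `report.summary['agg_ioops']`.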
%% Cell type:markdown id: tags:
Aggregations can also be computed individually, at a finer granularity:
%% Cell type:code id: tags:
``` python
report.mod_agg_iohist("MPI-IO") # to create the histograms
```
%% Output
{'READ_0_100': 0,
'READ_100_1K': 0,
'READ_1K_10K': 0,
'READ_10K_100K': 0,
'READ_100K_1M': 0,
'READ_1M_4M': 0,
'READ_4M_10M': 0,
'READ_10M_100M': 0,
'READ_100M_1G': 0,
'READ_1G_PLUS': 0,
'WRITE_0_100': 4,
'WRITE_100_1K': 14,
'WRITE_1K_10K': 0,
'WRITE_10K_100K': 0,
'WRITE_100K_1M': 0,
'WRITE_1M_4M': 0,
'WRITE_4M_10M': 0,
'WRITE_10M_100M': 0,
'WRITE_100M_1G': 16384,
'WRITE_1G_PLUS': 0}
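%% Cell type:markdown id: tags:
A histogram like the one above is often easier to read as relative shares. The following sketch assumes a dictionary with `READ_*` and `WRITE_*` keys as in the output. `write_size_shares` is a hypothetical helper, and the literal copies a few buckets from that output:

```python
def write_size_shares(iohist):
    """Return the fraction of write accesses per size bucket (empty buckets dropped)."""
    prefix = "WRITE_"
    writes = {k[len(prefix):]: v for k, v in iohist.items() if k.startswith(prefix)}
    total = sum(writes.values())
    if total == 0:
        return {}
    return {bucket: count / total for bucket, count in writes.items() if count > 0}


# a few buckets copied from the MPI-IO histogram above
iohist_excerpt = {
    "READ_0_100": 0,
    "WRITE_0_100": 4,
    "WRITE_100_1K": 14,
    "WRITE_1K_10K": 0,
    "WRITE_100M_1G": 16384,
}

shares = write_size_shares(iohist_excerpt)
dominant = max(shares, key=shares.get)  # bucket holding the most write accesses
print(dominant, f"{shares[dominant]:.1%}")
```

For this log, nearly all write accesses fall into the 100M to 1G bucket, which matches the raw counts shown above.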
%% Cell type:code id: tags:
``` python
report.agg_ioops() # to create the combined operation type summary
```
%% Output
{'MPI-IO': {'MPIIO_INDEP_OPENS': 0,
'MPIIO_COLL_OPENS': 2048,
'MPIIO_INDEP_READS': 0,
'MPIIO_INDEP_WRITES': 18,
'MPIIO_COLL_READS': 0,
'MPIIO_COLL_WRITES': 16384,
'MPIIO_SPLIT_READS': 0,
'MPIIO_SPLIT_WRITES': 0,
'MPIIO_NB_READS': 0,
'MPIIO_NB_WRITES': 0,
'MPIIO_SYNCS': 0,
'MPIIO_HINTS': 0,
'MPIIO_VIEWS': 32768,
'MPIIO_MODE': 9,
'MPIIO_BYTES_READ': 0,
'MPIIO_BYTES_WRITTEN': 2199023259968,
'MPIIO_RW_SWITCHES': 0,
'MPIIO_MAX_READ_TIME_SIZE': 0,
'MPIIO_MAX_WRITE_TIME_SIZE': 134217728,
'MPIIO_SIZE_READ_AGG_0_100': 0,
'MPIIO_SIZE_READ_AGG_100_1K': 0,
'MPIIO_SIZE_READ_AGG_1K_10K': 0,
'MPIIO_SIZE_READ_AGG_10K_100K': 0,
'MPIIO_SIZE_READ_AGG_100K_1M': 0,
'MPIIO_SIZE_READ_AGG_1M_4M': 0,
'MPIIO_SIZE_READ_AGG_4M_10M': 0,
'MPIIO_SIZE_READ_AGG_10M_100M': 0,
'MPIIO_SIZE_READ_AGG_100M_1G': 0,
'MPIIO_SIZE_READ_AGG_1G_PLUS': 0,
'MPIIO_SIZE_WRITE_AGG_0_100': 4,
'MPIIO_SIZE_WRITE_AGG_100_1K': 14,
'MPIIO_SIZE_WRITE_AGG_1K_10K': 0,
'MPIIO_SIZE_WRITE_AGG_10K_100K': 0,
'MPIIO_SIZE_WRITE_AGG_100K_1M': 0,
'MPIIO_SIZE_WRITE_AGG_1M_4M': 0,
'MPIIO_SIZE_WRITE_AGG_4M_10M': 0,
'MPIIO_SIZE_WRITE_AGG_10M_100M': 0,
'MPIIO_SIZE_WRITE_AGG_100M_1G': 16384,
'MPIIO_SIZE_WRITE_AGG_1G_PLUS': 0,
'MPIIO_ACCESS1_ACCESS': 134217728,
'MPIIO_ACCESS2_ACCESS': 272,
'MPIIO_ACCESS3_ACCESS': 544,
'MPIIO_ACCESS4_ACCESS': 328,
'MPIIO_ACCESS1_COUNT': 16384,
'MPIIO_ACCESS2_COUNT': 8,
'MPIIO_ACCESS3_COUNT': 2,
'MPIIO_ACCESS4_COUNT': 2,
'MPIIO_FASTEST_RANK': 597,
'MPIIO_FASTEST_RANK_BYTES': 1073741824,
'MPIIO_SLOWEST_RANK': 1312,
'MPIIO_SLOWEST_RANK_BYTES': 1073741824},
'MPI-IO_indep_simple': {'Read': 0,
'Write': 18,
'Open': 0,
'Stat': 0,
'Seek': 0,
'Mmap': 0,
'Fsync': 0},
'MPI-IO_coll_simple': {'Read': 0,
'Write': 16384,
'Open': 2048,
'Stat': 0,
'Seek': 0,
'Mmap': 0,
'Fsync': 0},
'POSIX': {'POSIX_OPENS': 2049,
'POSIX_FILENOS': -1,
'POSIX_DUPS': -1,
'POSIX_READS': 0,
'POSIX_WRITES': 16402,
'POSIX_SEEKS': 16404,
'POSIX_STATS': 0,
'POSIX_MMAPS': 0,
'POSIX_FSYNCS': 0,
'POSIX_FDSYNCS': 0,
'POSIX_RENAME_SOURCES': -1,
'POSIX_RENAME_TARGETS': -1,
'POSIX_RENAMED_FROM': 0,
'POSIX_MODE': 0,
'POSIX_BYTES_READ': 0,
'POSIX_BYTES_WRITTEN': 2199023259968,
'POSIX_MAX_BYTE_READ': 0,
'POSIX_MAX_BYTE_WRITTEN': 2199023261831,
'POSIX_CONSEC_READS': 0,
'POSIX_CONSEC_WRITES': 0,
'POSIX_SEQ_READS': 0,
'POSIX_SEQ_WRITES': 16384,
'POSIX_RW_SWITCHES': 0,
'POSIX_MEM_NOT_ALIGNED': 0,
'POSIX_MEM_ALIGNMENT': 8,
'POSIX_FILE_NOT_ALIGNED': 16401,
'POSIX_FILE_ALIGNMENT': 1048576,
'POSIX_MAX_READ_TIME_SIZE': 0,
'POSIX_MAX_WRITE_TIME_SIZE': 134217728,
'POSIX_SIZE_READ_0_100': 0,
'POSIX_SIZE_READ_100_1K': 0,
'POSIX_SIZE_READ_1K_10K': 0,
'POSIX_SIZE_READ_10K_100K': 0,
'POSIX_SIZE_READ_100K_1M': 0,
'POSIX_SIZE_READ_1M_4M': 0,
'POSIX_SIZE_READ_4M_10M': 0,
'POSIX_SIZE_READ_10M_100M': 0,
'POSIX_SIZE_READ_100M_1G': 0,
'POSIX_SIZE_READ_1G_PLUS': 0,
'POSIX_SIZE_WRITE_0_100': 4,
'POSIX_SIZE_WRITE_100_1K': 14,
'POSIX_SIZE_WRITE_1K_10K': 0,
'POSIX_SIZE_WRITE_10K_100K': 0,
'POSIX_SIZE_WRITE_100K_1M': 0,
'POSIX_SIZE_WRITE_1M_4M': 0,
'POSIX_SIZE_WRITE_4M_10M': 0,
'POSIX_SIZE_WRITE_10M_100M': 0,
'POSIX_SIZE_WRITE_100M_1G': 16384,
'POSIX_SIZE_WRITE_1G_PLUS': 0,
'POSIX_STRIDE1_STRIDE': 274743689216,
'POSIX_STRIDE2_STRIDE': 274743691264,
'POSIX_STRIDE3_STRIDE': 0,
'POSIX_STRIDE4_STRIDE': 0,
'POSIX_STRIDE1_COUNT': 10240,
'POSIX_STRIDE2_COUNT': 4096,
'POSIX_STRIDE3_COUNT': 0,
'POSIX_STRIDE4_COUNT': 0,
'POSIX_ACCESS1_ACCESS': 134217728,
'POSIX_ACCESS2_ACCESS': 272,
'POSIX_ACCESS3_ACCESS': 544,
'POSIX_ACCESS4_ACCESS': 328,
'POSIX_ACCESS1_COUNT': 16384,
'POSIX_ACCESS2_COUNT': 8,
'POSIX_ACCESS3_COUNT': 2,
'POSIX_ACCESS4_COUNT': 2,
'POSIX_FASTEST_RANK': 597,
'POSIX_FASTEST_RANK_BYTES': 1073741824,
'POSIX_SLOWEST_RANK': 1312,
'POSIX_SLOWEST_RANK_BYTES': 1073741824},
'POSIX_simple': {'Read': 0,
'Write': 16402,
'Open': 2049,
'Stat': 0,
'Seek': 16404,
'Mmap': 0,
'Fsync': 0},
'STDIO': {'STDIO_OPENS': 129,
'STDIO_FDOPENS': -129,
'STDIO_READS': 0,
'STDIO_WRITES': 74,
'STDIO_SEEKS': 0,
'STDIO_FLUSHES': 0,
'STDIO_BYTES_WRITTEN': 3309,
'STDIO_BYTES_READ': 0,
'STDIO_MAX_BYTE_READ': 0,
'STDIO_MAX_BYTE_WRITTEN': 3307,
'STDIO_FASTEST_RANK': 0,
'STDIO_FASTEST_RANK_BYTES': 0,
'STDIO_SLOWEST_RANK': 0,
'STDIO_SLOWEST_RANK_BYTES': 0},
'STDIO_simple': {'Read': 0,
'Write': 74,
'Open': 129,
'Stat': 0,
'Seek': 0,
'Mmap': 0,
'Fsync': 0}}
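%% Cell type:markdown id: tags:
Derived metrics can be computed directly from the aggregated counters. The sketch below estimates the mean number of bytes per write, assuming the `<MODULE>_WRITES` and `<MODULE>_BYTES_WRITTEN` naming seen in the POSIX and STDIO entries above. The MPI-IO counters use a different prefix and split writes into several counters, so they are not covered. `mean_write_size` is a hypothetical helper:

```python
def mean_write_size(agg_ioops, module):
    """Return the mean number of bytes per write for `module`, or None without writes."""
    counters = agg_ioops[module]
    writes = counters[f"{module}_WRITES"]
    if writes <= 0:
        return None
    return counters[f"{module}_BYTES_WRITTEN"] / writes


# counters copied from the aggregation output above
ioops_excerpt = {
    "POSIX": {"POSIX_WRITES": 16402, "POSIX_BYTES_WRITTEN": 2199023259968},
    "STDIO": {"STDIO_WRITES": 74, "STDIO_BYTES_WRITTEN": 3309},
}

for module in ioops_excerpt:
    print(module, round(mean_write_size(ioops_excerpt, module)), "bytes per write on average")
```

Returning `None` for modules without writes avoids a division by zero and keeps "no data" distinct from a mean of zero.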
%% Cell type:markdown id: tags:
### Report Algebra (Experimental)
Various operations are implemented to merge, combine and manipulate log records. This is useful for analysis tasks, but can also be used to construct performance projections or extrapolations.
For convenience, we overload some of the operators provided by Python where they behave intuitively like their mathematical counterparts. In particular, we enable the combination of different object types.
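
To illustrate the general idea of operator overloading for record-like objects, here is a toy sketch in plain Python. It is not PyDarshan's implementation: the class `ToyRecord`, its counter names and the chosen semantics (key-wise sum for `+`, scaling for `*`) are assumptions made purely for illustration:

```python
class ToyRecord:
    """Illustrative stand-in for a log record; not PyDarshan's implementation."""

    def __init__(self, counters):
        self.counters = dict(counters)

    def __add__(self, other):
        # adding two records sums their counters key by key
        if not isinstance(other, ToyRecord):
            return NotImplemented
        keys = set(self.counters) | set(other.counters)
        return ToyRecord(
            {k: self.counters.get(k, 0) + other.counters.get(k, 0) for k in keys}
        )

    def __mul__(self, factor):
        # scaling a record, e.g. for a crude extrapolation to more iterations
        return ToyRecord({k: v * factor for k, v in self.counters.items()})


a = ToyRecord({"WRITES": 74, "OPENS": 129})
b = ToyRecord({"WRITES": 26, "READS": 5})
merged = a + b
projected = merged * 2
print(merged.counters)
print(projected.counters)
```

Returning `NotImplemented` for unsupported operand types lets Python fall back to the other operand's reflected method, which is the usual way to support combining different object types.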
%% Cell type:code id: tags:
``` python
import darshan
darshan.enable_experimental()
```
%% Cell type:code id: tags:
``` python
# merging records
from darshan.experimental.plots.matplotlib import plot_access_histogram
from darshan.experimental.plots.matplotlib import plot_opcounts