concurrent I/O from threads gets counted twice in timing
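If two threads in the same MPI process access the same file concurrently, each thread adds its own elapsed time to the file's cumulative time counters. The overlapping interval is therefore counted once per thread, and the reported cumulative time can exceed the wall-clock time actually spent in I/O.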
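We need to add a reference count to the per-file run-time data structure that tracks how many threads are accessing the file at once. The start time should be recorded when the count goes from zero to one, and the cumulative time should not be incremented until the reference count drops back to zero.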
This does not require a log format change.
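
A minimal sketch of the reference-counting idea, written in Python for brevity. The class name, field names, and the explicit `now` timestamps are all hypothetical and chosen for illustration; they are not taken from the actual run-time data structure.

```python
import threading


class FileTimeCounter:
    """Hypothetical per-file record: accumulates time spent in overlapping I/O once."""

    def __init__(self):
        self._lock = threading.Lock()
        self.refcount = 0           # threads currently inside an I/O call on this file
        self.interval_start = 0.0   # timestamp when refcount went from 0 to 1
        self.cumulative_time = 0.0  # total time with at least one thread in I/O

    def begin_io(self, now):
        # Record the start time only for the first concurrent accessor.
        with self._lock:
            if self.refcount == 0:
                self.interval_start = now
            self.refcount += 1

    def end_io(self, now):
        # Add to the cumulative counter only when the last accessor leaves.
        with self._lock:
            self.refcount -= 1
            if self.refcount == 0:
                self.cumulative_time += now - self.interval_start
```

With two overlapping accesses, say thread A from t=0 to t=3 and thread B from t=1 to t=2, this records 3 seconds. Adding each thread's elapsed time separately would record 4 seconds. Only the in-memory record gains a field here, which is consistent with the note above that the log format is unaffected.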