The synthetic IO language is a simple, interpreted set of IO and basic
arithmetic commands meant to simplify the specification and running of IO workloads.
The input for the workload generator consists of an IO kernel metadata file and
a number of IO kernel files. The former specifies a set of kernel files to run
and logical client IDs to participate in the workload, while the latter
describes the IO to be performed.
The format of the metadata file is a set of lines, each containing a group ID,
a start ID, an end ID, and a kernel file path, where:
the group ID is the ID of this group (see restrictions)
the start ID and end ID form the range of logical client IDs that will
perform the given workload. Note that the end ID is inclusive, so a start,
end pair of 0, 3 will include IDs 0, 1, 2, and 3. An end ID of -1 indicates
that the remaining clients, as specified by the user, should be used.
the kernel file path is the path to the IO kernel workload. It may be either
an absolute or a relative path.
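The inclusive range and the -1 convention described above can be sketched as follows. This is an illustration only, not the parser used by the workload generator: it assumes whitespace-separated fields in the order group ID, start ID, end ID, kernel file path, and it assumes that -1 expands to the last client ID (total clients minus one). The names `KernelGroup`, `parse_metadata_line`, and `total_clients` are hypothetical.

```python
from typing import List, NamedTuple


class KernelGroup(NamedTuple):
    """One metadata line, with the client range already expanded."""
    group_id: int
    client_ids: List[int]
    kernel_path: str


def parse_metadata_line(line: str, total_clients: int) -> KernelGroup:
    # Assumed field order: group ID, start ID, end ID, kernel file path.
    group_id, start_id, end_id, kernel_path = line.split(maxsplit=3)
    start, end = int(start_id), int(end_id)
    if end == -1:
        # Assumption: -1 means "through the last client requested by the user".
        end = total_clients - 1
    # The end ID is inclusive, hence the +1.
    client_ids = list(range(start, end + 1))
    return KernelGroup(int(group_id), client_ids, kernel_path.strip())
```

For example, with 8 clients in total, a line such as `0 0 3 kernels/example.k` (a made-up path) would yield client IDs 0 through 3, while `1 4 -1 kernels/example.k` would yield client IDs 4 through 7.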
The IO kernel file contains a set of commands performed on a per-client
basis. Like the workload generator interface, files are represented by integer
IDs, and the standard set of "POSIX-ish" operations can be applied (e.g., open,
close, sync, write, read) and have a similar argument list (file ID, [length],
[offset] where applicable). pread/pwrite equivalents are given by
More detailed documentation on the language is ongoing, but for now a general
example can be seen at doc/workload, which shows a simple out-of-core data
shuffle. Braver souls may wish to visit the implementation at src/iokernellang
The following restrictions currently apply to the IO language:
all user-defined variables must be a single, lower-case letter (the symbol
table from the code we inherited is an array of 26 chars)
the implementation of "groups" is currently broken. We have gotten around
this by hard-coding in the group size and client ID into the parser when a
kernel file is loaded (parsing currently occurs on a per-client basis).
Hence, getgroupid should be completely ignored and getgrouprank and
getgroupsize ignore the group ID parameter passed in.
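To see why variable names are limited to a single lower-case letter, consider a symbol table with exactly one slot per letter. The sketch below is a hypothetical Python illustration of that idea, not the inherited implementation; the class and method names are assumptions.

```python
class SymbolTable:
    """Fixed-size table with one slot per lower-case letter, 'a' through 'z'."""

    def __init__(self) -> None:
        self.slots = [0] * 26

    @staticmethod
    def _index(name: str) -> int:
        # Only single characters in the range 'a'..'z' map to a slot.
        if len(name) != 1 or not ("a" <= name <= "z"):
            raise ValueError(f"invalid variable name: {name!r}")
        return ord(name) - ord("a")

    def set(self, name: str, value: int) -> None:
        self.slots[self._index(name)] = value

    def get(self, name: str) -> int:
        return self.slots[self._index(name)]
```

With this layout, a name such as `x` maps directly to slot 23, while names such as `xy` or `X` have no slot at all and are rejected.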
The IO language is frozen and no future development will be happening with it,
so keep the following limitations in mind when using it.
There is currently no way to specify a "create" flag to open.
Variables are expected to be a single lowercase character.
"Mock" IO workload
The mock IO workload generator creates a sequential workload of N requests of
size M. The generated file ID is either an optional input or 0. There is also
an option to add a (simulated) process's rank to the file ID, giving in effect
a unique file per rank. Relevant configuration parameters are:
mock_num_requests - the number of requests
mock_request_size - the size of each request
mock_request_type - the type of request ("read" or "write")
mock_file_id (optional) - the file ID to use, default 0
mock_use_unique_file_ids (optional) - if non-zero, add the workload
processor's rank to the file ID. Default is 0.
mock_rank_table_size (optional) - the size of the hash table used to store the
ranks. For minimal collisions, choose a value larger than the expected number
of ranks in the workload.
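As a rough illustration of how these parameters combine, the sketch below generates the request stream for one rank. It is not the generator's actual code: the function name and the dictionary layout are hypothetical, and it assumes that "sequential" means offsets start at 0 and advance by the request size.

```python
from typing import Dict, Iterator, Union


def mock_requests(
    num_requests: int,
    request_size: int,
    request_type: str,
    file_id: int = 0,
    use_unique_file_ids: int = 0,
    rank: int = 0,
) -> Iterator[Dict[str, Union[int, str]]]:
    """Yield N sequential requests of size M for a single (simulated) rank."""
    if request_type not in ("read", "write"):
        raise ValueError("request_type must be 'read' or 'write'")
    # If unique file IDs are requested, offset the file ID by the rank.
    effective_id = file_id + rank if use_unique_file_ids else file_id
    for i in range(num_requests):
        yield {
            "type": request_type,
            "file_id": effective_id,
            "offset": i * request_size,  # assumed: contiguous, starting at 0
            "size": request_size,
        }
```

For instance, three 4096-byte writes with a file ID of 10, unique file IDs enabled, and rank 2 would all target file ID 12, at offsets 0, 4096, and 8192.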
Recorder IO workload
Recorder has both a static and a dynamic library that may be linked to a given application (preloaded at runtime in the case of the dynamic library). Whenever an MPI process calls an I/O function that is instrumented at a specific layer of the I/O stack by Recorder, the timestamp, function name, arguments, return value, and the duration of the function are stored into a per-process trace file.
For more details, see "Techniques for Modeling Large-Scale HPC I/O Workloads" by S. Snyder et al. in the PMBS workshop, held in conjunction with SC'15.
Checkpoint IO workload
The checkpoint I/O workload is based on the optimum checkpointing interval for applications, which minimizes both the time spent writing checkpoints and the time spent recomputing work lost to failure, given the system’s expected mean time to failure (MTTF), the amount of data to be checkpointed, and the available storage bandwidth. The amount of data to be checkpointed, the storage/network bandwidth, and the MTTF can be configured by the user in the config files. See the README in the codes-storage-server repo for more details on how to use the checkpoint IO workload.
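To make the trade-off concrete, one well-known first-order estimate of the optimum interval is Young's approximation, sqrt(2 * C * MTTF), where C is the time to write one checkpoint. The sketch below uses that formula as an assumption; this text does not state which formula the checkpoint workload actually implements, and the function and parameter names are hypothetical.

```python
import math


def young_checkpoint_interval(
    checkpoint_bytes: float,
    bandwidth_bytes_per_s: float,
    mttf_s: float,
) -> float:
    """Estimate the optimum compute time between checkpoints, in seconds.

    Uses Young's first-order approximation, which is an assumption here
    and not necessarily the formula used by the checkpoint workload.
    """
    if checkpoint_bytes <= 0 or bandwidth_bytes_per_s <= 0 or mttf_s <= 0:
        raise ValueError("all inputs must be positive")
    # Time to write a single checkpoint at the available bandwidth.
    write_time_s = checkpoint_bytes / bandwidth_bytes_per_s
    return math.sqrt(2.0 * write_time_s * mttf_s)
```

As a quick check with round numbers, a checkpoint that takes 10 seconds to write on a system with an MTTF of 2000 seconds gives an interval of sqrt(2 * 10 * 2000) = 200 seconds. Larger checkpoints or a longer MTTF both lengthen the interval.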