Commit e73e7b37 authored by Caitlin Ross

Update README-vis.md

parent 4d25ba0e
## README for using ROSS instrumentation with CODES
For details about the ROSS instrumentation, see the [ROSS Instrumentation blog post](http://carothersc.github.io/ROSS/feature/instrumentation.html)
on the ROSS webpage.
There are currently 3 types of instrumentation: GVT-based, real time, and event tracing. See the ROSS documentation for more info on
the specific options or use `--help` with your model. To collect data about the simulation engine, no changes to the model code are needed
for any of the instrumentation modes. Some additions to the model code are needed in order to turn on any model-level data collection.
See the "Model-level data sampling" section of the [ROSS Instrumentation blog post](http://carothersc.github.io/ROSS/feature/instrumentation.html).
Here we describe CODES-specific details.
### Register Instrumentation Callback Functions
The examples here are based on the dragonfly router and terminal LPs (`src/networks/model-net/dragonfly.c`).
As described in the ROSS Vis documentation, we need to create a `st_model_types` struct with the pointer and size information.
```C
st_model_types dragonfly_model_types[] = {
    {(rbev_trace_f) dragonfly_event_collect,
     sizeof(int),
     (ev_trace_f) dragonfly_event_collect,
     sizeof(int),
     (model_stat_f) dragonfly_model_stat_collect,
     sizeof(tw_lpid) + sizeof(long) * 2 + sizeof(double) + sizeof(tw_stime) * 2},
    {(rbev_trace_f) dragonfly_event_collect,
     sizeof(int),
     (ev_trace_f) dragonfly_event_collect,
     sizeof(int),
     (model_stat_f) dfly_router_model_stat_collect,
     0}, // updated in router_setup()
    {0}
};
```
`dragonfly_model_types[0]` holds the function pointers for the terminal LP and `dragonfly_model_types[1]` holds those for the router LP.
For the first two function pointers for each LP, we use the same `dragonfly_event_collect()` because right now we just collect the event type, so
it's the same for both of these LPs. You can change these if you want to use different functions for different LP types or if you want a different
function for the full event tracing than that used for the rollback event trace (`rbev_trace_f` is for the event tracing of rollback-triggering events only,
while `ev_trace_f` is for the full event tracing).
The number following each function pointer is the size in bytes of the data that will be saved when the function is called.
The third pointer is for the data to be sampled at the GVT or real time sampling points.
In this case the LPs have different function pointers since we want to collect different types of data for the two LP types.
For the terminal, I set the appropriate size of the data to be collected, but for the router, the size of the data depends on the radix of the
dragonfly configuration being used, which isn't known until runtime.
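
To illustrate the "updated in router_setup()" idea, here is a minimal sketch of filling in a sample size once the radix is known. It is not the CODES source: `demo_model_types`, `demo_router_sample_size()`, and `demo_router_setup()` are hypothetical names, and the per-port layout (one `double` and one `long` per port plus an LP id) is an assumption made only for the example.

```C
#include <stddef.h>

/* Hypothetical, simplified stand-in for one st_model_types entry. */
typedef struct {
    size_t ev_sz;    /* bytes written by the event tracing callback */
    size_t mstat_sz; /* bytes written by the model-level sampling callback */
} demo_model_types;

/* Assumed sample layout: an LP id plus one double and one long per port.
 * The real layout used by the router LP may differ. */
size_t demo_router_sample_size(int radix)
{
    return sizeof(unsigned long)
         + (size_t) radix * (sizeof(double) + sizeof(long));
}

/* Called once the configuration (and therefore the radix) has been read. */
void demo_router_setup(demo_model_types *router_types, int radix)
{
    router_types->mstat_sz = demo_router_sample_size(radix);
}
```

The point of the sketch is only the ordering: the size starts at 0 in the static table and is overwritten before any sampling callback runs.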
*Note*: You can only reuse the function for event tracing for LPs that use the same type of message struct.
For example, the dragonfly terminal and router LPs both use the `terminal_message` struct, so they can
use the same functions for event tracing. However, the model net base LP uses the `model_net_wrap_msg` struct, so it gets its own event collection function and
`st_model_types` struct, in order to read the event type correctly from the model.
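
For reference, an event collection callback of the kind registered above can be very small. The sketch below follows the shape of the earlier server example in this README (copy the event type into the buffer provided by ROSS), but `demo_msg`, `demo_lp`, and `demo_event_collect()` are hypothetical stand-ins so that it compiles without ROSS or CODES.

```C
#include <string.h>

/* Hypothetical stand-ins for a model's message struct and for tw_lp. */
typedef struct { int event_type; double payload; } demo_msg;
typedef struct { unsigned long gid; } demo_lp;

/* Copies the event type into the buffer handed over by the engine.
 * The number of bytes written must match the size registered next to
 * the function pointer (sizeof(int) in the tables above). */
void demo_event_collect(demo_msg *m, demo_lp *lp, char *buffer)
{
    int type = (int) m->event_type;
    (void) lp; /* unused in this sketch */
    memcpy(buffer, &type, sizeof(type));
}
```

Because the callback reads a field of the message struct, it can only be shared between LP types whose events use that same struct, which is the restriction described in the note.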
In the ROSS instrumentation documentation, there are two methods provided for letting ROSS know about these `st_model_types` structs.
In CODES, this step is a little different, as `codes_mapping_setup()` calls `tw_lp_settype()`.
Instead, you add a function to return this struct for each of your LP types:
```C
static const st_model_types *dragonfly_get_model_types(void)
{
    return(&dragonfly_model_types[0]);
}
static const st_model_types *dfly_router_get_model_types(void)
{
    return(&dragonfly_model_types[1]);
}
```
Now you need to add register functions for CODES:
```C
static void dragonfly_register_model_types(st_model_types *base_type)
{
    st_model_type_register(LP_CONFIG_NM_TERM, base_type);
}

static void router_register_model_types(st_model_types *base_type)
{
    st_model_type_register(LP_CONFIG_NM_ROUT, base_type);
}
```
`st_model_type_register(const char* name, const st_model_types* type)` is part of the CODES API and lets CODES know the pointers for LP initialization.
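
Conceptually, a register call of this kind associates an LP type name with its callback table so that the table can be found again when LPs of that name are set up. The sketch below shows one simple way such a name-keyed lookup could work; it is not the CODES implementation, and `demo_model_types`, `demo_type_register()`, and `demo_type_lookup()` are hypothetical names.

```C
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_TYPES 16

/* Hypothetical, simplified stand-in for st_model_types. */
typedef struct { size_t ev_sz; } demo_model_types;

static struct {
    const char *name;
    const demo_model_types *type;
} demo_table[DEMO_MAX_TYPES];
static int demo_count = 0;

/* Remembers the table for an LP type name. Returns 0 on success and
 * -1 if the fixed-size table is full. */
int demo_type_register(const char *name, const demo_model_types *type)
{
    if (demo_count >= DEMO_MAX_TYPES)
        return -1;
    demo_table[demo_count].name = name;
    demo_table[demo_count].type = type;
    demo_count++;
    return 0;
}

/* Returns the table registered under name, or NULL if none was registered. */
const demo_model_types *demo_type_lookup(const char *name)
{
    int i;
    for (i = 0; i < demo_count; i++) {
        if (strcmp(demo_table[i].name, name) == 0)
            return demo_table[i].type;
    }
    return NULL;
}
```

In a design like this, an LP type that was never registered simply yields `NULL` at lookup time, so no model-level data would be collected for it.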
At this point, there are two different steps to follow depending on whether the model is one of the model-net models or not.
##### Model-net Models
In the `model_net_method` struct, two fields have been added: `mn_model_stat_register` and `mn_get_model_stat_types`.
You need to set these to the functions described above. For example:
```C
struct model_net_method dragonfly_method =
{
    .mn_configure = dragonfly_configure,
    // ... all the usual model net stuff
    .mn_model_stat_register = dragonfly_register_model_types,
    .mn_get_model_stat_types = dragonfly_get_model_types,
};

struct model_net_method dragonfly_router_method =
{
    .mn_configure = NULL,
    // ... all the usual model net stuff
    .mn_model_stat_register = router_register_model_types,
    .mn_get_model_stat_types = dfly_router_get_model_types,
};
```
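
As a rough illustration of why these two fields are function pointers, the sketch below shows how generic setup code could call through a method struct and skip network types that do not provide the hooks. It is an assumption-laden example, not the model-net source: `demo_method`, `demo_setup_instrumentation()`, and the other `demo_` names are hypothetical.

```C
#include <stddef.h>

/* Hypothetical, simplified stand-in for st_model_types. */
typedef struct { size_t ev_sz; } demo_model_types;

/* Hypothetical analogue of the two fields added to model_net_method. */
typedef struct {
    void (*register_types)(demo_model_types *base_type);
    const demo_model_types *(*get_types)(void);
} demo_method;

static demo_model_types demo_types[] = { { sizeof(int) }, { 0 } };
static int demo_registered = 0; /* counts register calls, for illustration */

static const demo_model_types *demo_get_types(void)
{
    return &demo_types[0];
}

static void demo_register(demo_model_types *base_type)
{
    (void) base_type; /* a real hook would pass this on to the registry */
    demo_registered++;
}

static const demo_method demo_with_hooks = {
    .register_types = demo_register,
    .get_types = demo_get_types,
};
static const demo_method demo_without_hooks = {
    .register_types = NULL,
    .get_types = NULL,
};

/* Returns 1 if the hooks were present and invoked, 0 if the method
 * does not provide them. */
int demo_setup_instrumentation(const demo_method *m, demo_model_types *base_type)
{
    if (m->register_types == NULL || m->get_types == NULL)
        return 0;
    m->register_types(base_type);
    return m->get_types() != NULL;
}
```

With this shape, adding instrumentation to another network type only requires filling in the two fields in its method struct.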
##### All other CODES models
This example uses the synthetic workload LP for dragonfly (`src/network-workloads/model-net-synthetic.c`).
In the main function, you call the register function *before* calling `codes_mapping_setup()`.
```C
st_model_types svr_model_types[] = {
    {(rbev_trace_f) svr_event_collect,
     sizeof(int),
     (ev_trace_f) svr_event_collect,
     sizeof(int),
     (model_stat_f) svr_model_stat_collect,
     0}, // at the moment, we're not actually collecting any data about this LP
    {0}
};

static void svr_register_model_types(void)
{
    st_model_type_register("server", &svr_model_types[0]);
}

int main(int argc, char **argv)
{
    // ... some set up removed for brevity

    model_net_register();
    svr_add_lp_type();

    if (g_st_ev_trace || g_st_model_stats)
        svr_register_model_types();

    codes_mapping_setup();

    //...
}
```
`g_st_ev_trace` is a ROSS flag for determining if event tracing is turned on and `g_st_model_stats` determines if the GVT-based or real time instrumentation
modes are collecting model-level data as well.
### CODES LPs that currently have event type collection implemented:
- nw-lp (model-net-mpi-replay.c)
- original dragonfly router and terminal LPs (dragonfly.c)
- dfly server LP (model-net-synthetic.c)
- model-net-base-lp (model-net-lp.c)
- fat tree server LP (model-net-synthetic-fattree.c)

The fat-tree terminal and switch LPs (fattree.c) are only partially implemented at the moment. A full implementation needs two `model_net_method` structs,
but currently both terminal and switch LPs use the same `fattree_method` struct.