# Mochi Boot Camp: RPC profiling hands-on example

This directory contains a hands-on example of how to perform RPC-level
profiling of Mochi services.

## Prerequisites

Begin by running the initial example (the "sum" service, executed from an
interactive job) from `mochi-boot-camp/ecp-am-2020/sessions/hands-on/`.
This is the example service that we will instrument to demonstrate use of
the profiling tool chain.

## Concepts

The first step in collecting a Mochi RPC profile is to enable RPC
instrumentation at runtime.  There are some programmatic ways to do this
(see the Margo documentation for details), but the most straightforward
method is to simply set the `MARGO_ENABLE_PROFILING` environment variable
for all participating processes, including both clients and servers.
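
For example, on a system where processes inherit your shell environment
directly (unlike Summit, where environment variables must be passed through
`jsrun -E` as shown below), a minimal sketch might look like:

```
# hypothetical sketch for a non-jsrun launch: export the variable so that
# both the server and the client inherit it
$ export MARGO_ENABLE_PROFILING=1
$ ./server &
$ ./client <server-address-string>
```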

Margo will detect this environment variable and enable profiling.  In this
mode, data is collected over the duration of execution and then written to
the current working directory by each process as it exits.

To use profiling in this mode, you must run your processes from a directory
to which you have write access.  Note that on Summit, home directories are
not writable from compute nodes.  A more appropriate working directory
might be `/gpfs/alpine/<project>/scratch/<user>/`.
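
For instance, a hypothetical session (substitute your own project and user
name; the `profiling-demo` subdirectory name is just an example) might look
like this:

```
# create and move to a scratch subdirectory before launching the job
$ mkdir -p /gpfs/alpine/<project>/scratch/<user>/profiling-demo
$ cd /gpfs/alpine/<project>/scratch/<user>/profiling-demo
```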

Once your job is complete, you can use a Python script on the login node to
produce a summary PDF containing RPC performance graphs.  You will likely
need to `scp` this PDF to your local machine to view it.
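
As a rough sketch (the login host, project, and user name here are
placeholders to adjust for your own account), the copy step run from your
local machine might look like:

```
# hypothetical example, run from your local machine after the profile
# has been generated on the login node
$ scp <user>@summit.olcf.ornl.gov:/gpfs/alpine/<project>/scratch/<user>/profile.pdf .
```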

## Hands-on example steps

Follow the login, compilation, and job submission instructions from the
top-level hands-on example ("sum").  Once you have an interactive terminal,
do the following:

```
# cd to a writable directory (as in the following example, but replace
#   the csc332 project name and carns user name with your own)
$ cd /gpfs/alpine/csc332/scratch/carns/

# run the same example as before, but with additional environment variables
#   set using the -E option to jsrun:
$ jsrun -n 1 -r 1 -g ALL_GPUS -E MARGO_ENABLE_PROFILING=1 ./server &

# don't forget to change the address argument for the client to be the
#   address string emitted by the server
$ jsrun -n 1 -r 1 -g ALL_GPUS -E MARGO_ENABLE_PROFILING=1 ./client 'ofi+verbs;ofi_rxm://10.41.0.103:49201'

# exit job and return to login node
$ exit
```

On the login node:

```
# return to the directory that you were in when running the above example:
$ cd /gpfs/alpine/csc332/scratch/carns/

# load Python (Python 2 for now):
$ module load python/2.7.15-anaconda2-5.3.0

# generate a profile
$ margo-gen-profile
```

At this point you should have a file named `profile.pdf` that can be copied
to another computer and viewed.  There are multiple pages with different
plots comparing RPC performance.

This example is unusual in that there is only one RPC type observed (called
`sum`), and it was only executed 4 times.  This same workflow should work
for any Mochi example, however.

For more information see the [Margo instrumentation documentation](
https://xgitlab.cels.anl.gov/sds/margo/blob/master/doc/instrumentation.md).