Commit 3550f000 authored by Philip Carns's avatar Philip Carns
Browse files

update Bake README

parent 6cb1ae9e
# BAKE # Bake
Bake is a microservice (a Mochi provider) for high performance bulk
storage of raw data regions. Bake uses modular backends to store data
on persistent memory, conventional file systems, or other storage media.
See https://www.mcs.anl.gov/research/projects/mochi/ and
https://mochi.readthedocs.io/en/latest/ for more information about Mochi.
Bake's scope is limited exclusively to data storage. Capabilities such as
indexing, name spaces, and sharding must be provided by other microservice
components.
## Installation ## Installation
BAKE can easily be installed using Spack: The easiest way to install Bake is through spack:
`spack install bake` `spack install bake`
This will install BAKE and its dependencies (margo, uuid, and libpmem, This will install BAKE and its dependencies. Please refer to the end of the
as well as their respective dependencies). To compile and install BAKE manually, document for manual compilation instructions.
please refer to the end of this document.
## Architecture ## Architecture
...@@ -16,6 +26,9 @@ Like most Mochi services, BAKE relies on a client/provider architecture. ...@@ -16,6 +26,9 @@ Like most Mochi services, BAKE relies on a client/provider architecture.
A provider, identified by its _address_ and _multiplex id_, manages one or more A provider, identified by its _address_ and _multiplex id_, manages one or more
_BAKE targets_, referenced externally by their _target id_. _BAKE targets_, referenced externally by their _target id_.
A target can be thought of as a storage device. This may be (for example) a
PMDK volume or a local file system.
## Setting up a BAKE target ## Setting up a BAKE target
BAKE requires the backend storage file to be created beforehand using BAKE requires the backend storage file to be created beforehand using
...@@ -24,12 +37,11 @@ BAKE requires the backend storage file to be created beforehand using ...@@ -24,12 +37,11 @@ BAKE requires the backend storage file to be created beforehand using
`bake-mkpool -s 500M /dev/shm/foo.dat` `bake-mkpool -s 500M /dev/shm/foo.dat`
creates a 500 MB file at _/dev/shm/foo.dat_ to be used by BAKE as a target. creates a 500 MB file at _/dev/shm/foo.dat_ to be used by BAKE as a target.
Bake will use the `pmem` (persistent memory) backend by default, which means
The bake-mkpool command creates a BAKE pool used to store raw data for that the underlying file will memory mapped for access usign the PMDK
a particular BAKE target. This is essentially a wrapper command around library. You can also providie an explicit prefix (such as `file:` for the
pmemobj utilities for creating an empty pool that additionally stores conventional file backend or `pmem:` for the persistent memory backend) to
some BAKE-specific metadata in the created pool. Pools used by BAKE dictate a specific target type.
providers must be created using this command.
## Starting a daemon ## Starting a daemon
...@@ -56,8 +68,21 @@ different multiplex ids 1, 2, ... _N_ where _N_ is the number of storage targets ...@@ -56,8 +68,21 @@ different multiplex ids 1, 2, ... _N_ where _N_ is the number of storage targets
to manage. The _targets_ mode indicates that a single provider should be used to to manage. The _targets_ mode indicates that a single provider should be used to
manage all the storage targets. manage all the storage targets.
## Integrating Bake into a larger service
Bake is not intended to be a standalone user-facing service. See
https://mochi.readthedocs.io/en/latest/bedrock.html for guidance on how to
integrate it with other providers using Mochi's Bedrock capability.
## Client API example ## Client API example
Data is stored in `regions` within a `target` using explicit create,
write, and persist operations. The caller cannot dictate the region id
that will be used to reference a region; this identifier is generated
by Bake at creation time. The region size must be specified at creation
time as well; there is no mechanism for extending the size of an existing
region.
```c ```c
#include <bake-client.h> #include <bake-client.h>
...@@ -71,10 +96,10 @@ int main(int argc, char **argv) ...@@ -71,10 +96,10 @@ int main(int argc, char **argv)
uint8_t mplex_id; // multiplex id of the provider uint8_t mplex_id; // multiplex id of the provider
uint32_t target_number; // target to use uint32_t target_number; // target to use
bake_region_id_t rid; // BAKE region id handle bake_region_id_t rid; // BAKE region id handle
bake_target_id_t* bti; // array of target ids bake_target_id_t* bti; // array of target ids
/* ... setup variables ... */ /* ... setup variables ... */
/* Initialize Margo */ /* Initialize Margo */
mid = margo_init(..., MARGO_CLIENT_MODE, 0, -1); mid = margo_init(..., MARGO_CLIENT_MODE, 0, -1);
/* Lookup the server */ /* Lookup the server */
...@@ -109,9 +134,9 @@ int main(int argc, char **argv) ...@@ -109,9 +134,9 @@ int main(int argc, char **argv)
} }
``` ```
Note that a `bake_region_id_t` object can be written (into a file or a socket) Note that a `bake_region_id_t` object is persistent. It can be written
and stored or sent to another program. These region ids are what uniquely (into a file or a socket) and stored or sent to another program. These
reference a region within a given target. region ids are what uniquely reference a region within a given target.
The rest of the client-side API can be found in `bake-client.h`. The rest of the client-side API can be found in `bake-client.h`.
...@@ -122,44 +147,26 @@ attach storage targets to them. The provider-side API is located in ...@@ -122,44 +147,26 @@ attach storage targets to them. The provider-side API is located in
_bake-server.h_, and consists of mainly two functions: _bake-server.h_, and consists of mainly two functions:
```c ```c
int bake_provider_register( int bake_provider_register(margo_instance_id mid,
margo_instance_id mid, uint16_t provider_id,
uint8_t mplex_id, const struct bake_provider_init_info* args,
ABT_pool pool, bake_provider_t* provider);
bake_provider_t* provider);
``` ```
This creates a provider at the given multiplex id, using a given Argobots pool. This creates a provider at the given provider id using the specified margo
instance. The `args` parameter can be used to modify default settings,
including passing in a fully specified json configuration block. See
`bake-server.h` for details.
```c ```c
int bake_provider_add_storage_target( int bake_provider_attach_target(bake_provider_t provider,
bake_provider_t provider, const char* target_name,
const char *target_name, bake_target_id_t* target_id);
bake_target_id_t* target_id);
``` ```
This makes the provider manage the given storage target. This makes the provider manage the given storage target.
Other functions are available to remove a storage target (or all storage Other functions are available to create and detach targets from a provider.
targets) from a provider.
## Latency benchmark execution example
* `./bake-latency-bench sm:///tmp/cci/sm/carns-x1/1/1 100000 4 8`
This example runs a sequence of latency benchmarks. Other utilities
installed with BAKE will perform other rudimentary operations.
The first argument is the address of the server. We are using CCI/SM in this
case, which means that the URL is a path to the connection information of the
server in /tmp. The base (not changeable) portion of the path is
/tmp/cci/HOSTNAME. The remainder of the path was specified by the server
daemon when it started.
The second argument is the number of benchmark iterations.
The third and fourth arguments specify the range of sizes to use for read and
write operations in the benchmark.
## Generic Bake benchmark ## Generic Bake benchmark
...@@ -211,36 +218,18 @@ The following table describes each type of benchmark and their parameters. ...@@ -211,36 +218,18 @@ The following table describes each type of benchmark and their parameters.
| | preregister-bulk | false | Whether to preregister the client's buffer for RDMA | | | preregister-bulk | false | Whether to preregister the client's buffer for RDMA |
| | erase-on-teardown | true | Whether to remove the regions after the benchmark | | | erase-on-teardown | true | Whether to remove the regions after the benchmark |
## Misc tips
Memory allocation seems to account for a significant portion of
the latency as of this writing. The tcmalloc library will lower and
stabilize latency somewhat as a partial short-term solution. On ubuntu
you can use tcmalloc by running this in the terminal before the server
and client commands:
export LD_PRELOAD=/usr/lib/libtcmalloc.so.4
## Manual installation ## Manual installation
BAKE depends on the following libraries: BAKE depends on the following libraries:
* uuid (install uuid-dev package on ubuntu) * uuid (install uuid-dev package on ubuntu)
* NVML/libpmem (see instructions below) * PMDK (see instructions below)
* margo (see instructions [here](https://xgitlab.cels.anl.gov/sds/margo)) * json-c
* mochi-abt-io
Yu can compile and install the latest git revision of NVML as follows: * mochi-margo
* `git clone https://github.com/pmem/nvml.git`
* `cd nvml`
* `make`
* `make install prefix=/home/carns/working/install/`
`make install` will require `pandoc` or you can ignore the error and not build
the documentation.
To compile BAKE: Bake will automatically identify these dependencies at configure time using
pkg-config. To compile BAKE:
* `./prepare.sh` * `./prepare.sh`
* `mkdir build` * `mkdir build`
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment