hbandit
=======

Safe multi-armed bandit implementations:

-   Eps-Greedy (fixed rate, inverse squared rate)
-   Exp3 (hyperparameter-free rate from \[[1](#ref-bubeck2012regret)\])
-   Exp4.R \[[2](#ref-sun2017safety)\]
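
To make the first two entries concrete, here is a minimal Python sketch of a fixed-rate epsilon-greedy choice and of Exp3 on losses in [0, 1]. It is an illustration only and not hbandit's API: the function names (`eps_greedy_choice`, `exp3_probabilities`, `run_exp3`) are hypothetical, and the learning-rate schedule `eta_t = sqrt(ln K / (t K))` is assumed to be the horizon-free rate meant by the Exp3 entry above.

```python
import math
import random


def eps_greedy_choice(mean_losses, eps, rng):
    """Fixed-rate epsilon-greedy: explore uniformly with probability eps,
    otherwise play the arm with the lowest estimated mean loss."""
    if rng.random() < eps:
        return rng.randrange(len(mean_losses))
    return min(range(len(mean_losses)), key=lambda i: mean_losses[i])


def exp3_probabilities(cum_loss_estimates, eta):
    """Exponential-weights distribution over arms for learning rate eta."""
    # Shifting by the minimum avoids overflow and leaves the distribution unchanged.
    m = min(cum_loss_estimates)
    weights = [math.exp(-eta * (l - m)) for l in cum_loss_estimates]
    total = sum(weights)
    return [w / total for w in weights]


def run_exp3(loss_fn, n_arms, n_rounds, rng):
    """Play n_rounds of Exp3 and return the pull count of each arm.

    loss_fn(arm, rng) is a hypothetical callback returning a loss in [0, 1].
    """
    cum = [0.0] * n_arms               # importance-weighted loss estimates
    pulls = [0] * n_arms
    probs = [1.0 / n_arms] * n_arms    # first round: uniform distribution
    for t in range(1, n_rounds + 1):
        arm = rng.choices(range(n_arms), weights=probs)[0]
        loss = loss_fn(arm, rng)
        # Only the played arm is updated; dividing by its probability keeps
        # the loss estimate unbiased.
        cum[arm] += loss / probs[arm]
        pulls[arm] += 1
        # Assumed horizon-free schedule: eta_t = sqrt(ln K / (t K)).
        eta = math.sqrt(math.log(n_arms) / (t * n_arms))
        probs = exp3_probabilities(cum, eta)
    return pulls
```

For example, `run_exp3(lambda arm, rng: 0.0 if arm == 0 else 1.0, 3, 2000, random.Random(0))` should concentrate its pulls on arm 0, the only arm with zero loss.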

documentation
-------------

Build the documentation with Nix, pointing `nix-build` at a local checkout or at a tarball URL:

      nix-build /path/to/hbandit/or/url/to/tarball -A hbandit.doc

<!-- vim: set ft=markdown.pandoc cole=0: -->

\[1\] Bubeck, S. and Cesa-Bianchi, N. 2012. Regret analysis of stochastic
and nonstochastic multi-armed bandit problems. *Foundations and Trends in
Machine Learning*. 5, 1 (2012), 1–122.

\[2\] Sun, W. et al. 2017. Safety-aware algorithms for adversarial
contextual bandit. *Proceedings of the 34th International Conference on
Machine Learning, Volume 70* (2017), 3280–3288.