Abstract
We present a provably optimal differentially private algorithm for the stochastic multi-arm bandit problem. In contrast, the private analogue of the UCB algorithm (Mishra and Thakurta, 2015; Tossou and Dimitrakakis, 2016) does not meet the recently discovered lower bound of Ω(K log(T)/ϵ) (Shariff and Sheffet, 2018). Our construction is based on a different algorithm, Successive Elimination (Even-Dar et al., 2002), which repeatedly pulls all remaining arms until an arm is found to be suboptimal, at which point that arm is eliminated. To devise a private analogue of Successive Elimination, we consider the problem of a private stopping rule, which takes as input a stream of i.i.d. samples from an unknown distribution and returns a multiplicative (1 ± α)-approximation of the distribution's mean, and we prove the optimality of our private stopping rule. We then present the private Successive Elimination algorithm, which meets both the non-private lower bound (Lai and Robbins, 1985) and the above-mentioned private lower bound. We also empirically compare the performance of our algorithm with that of the private UCB algorithm.
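For orientation, the elimination loop described in the abstract can be sketched as follows. This is a minimal, non-private illustration of Successive Elimination only; it is not the paper's private algorithm or its DP-SR stopping rule. The function name `successive_elimination`, the `pull` callback, the `delta` parameter, and the particular Hoeffding-style confidence radius are all assumptions made for this example, and rewards are assumed to lie in [0, 1].

```python
import math
import random


def successive_elimination(pull, num_arms, horizon, delta=0.05):
    """Non-private Successive Elimination sketch (illustrative assumptions only).

    pull(arm) returns a reward in [0, 1]; horizon is the total pull budget.
    Returns the list of arms that have not been eliminated.
    """
    active = list(range(num_arms))
    sums = [0.0] * num_arms
    pulls_used = 0
    rounds = 0
    while len(active) > 1 and pulls_used + len(active) <= horizon:
        rounds += 1
        # Pull every remaining arm once; each active arm has `rounds` samples.
        for arm in active:
            sums[arm] += pull(arm)
        pulls_used += len(active)
        # Hoeffding-style radius with a union bound over arms and rounds
        # (one common choice, assumed here for illustration).
        radius = math.sqrt(
            math.log(4.0 * num_arms * rounds * rounds / delta) / (2.0 * rounds)
        )
        means = {arm: sums[arm] / rounds for arm in active}
        best = max(means.values())
        # Keep an arm only if its upper bound reaches the best lower bound.
        active = [arm for arm in active if means[arm] + radius >= best - radius]
    return active


if __name__ == "__main__":
    rng = random.Random(0)
    arm_means = [0.9, 0.5, 0.1]  # hypothetical Bernoulli arms

    def bernoulli_pull(arm):
        return 1.0 if rng.random() < arm_means[arm] else 0.0

    print(successive_elimination(bernoulli_pull, len(arm_means), horizon=20000))
```

A private variant would additionally have to add noise to the statistics used in the elimination test; how that is done is specified in the paper itself and is not reproduced here.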
| Original language | English |
|---|---|
| Title of host publication | 36th International Conference on Machine Learning, ICML 2019 |
| Publisher | International Machine Learning Society (IMLS) |
| Pages | 9791-9800 |
| Number of pages | 10 |
| ISBN (Electronic) | 9781510886988 |
| State | Published - 2019 |
| Externally published | Yes |
| Event | 36th International Conference on Machine Learning, ICML 2019 - Long Beach, United States. Duration: 9 Jun 2019 → 15 Jun 2019 |
Publication series
| Name | 36th International Conference on Machine Learning, ICML 2019 |
|---|---|
| Volume | 2019-June |
Conference
| Conference | 36th International Conference on Machine Learning, ICML 2019 |
|---|---|
| Country/Territory | United States |
| City | Long Beach |
| Period | 9/06/19 → 15/06/19 |
Bibliographical note
Publisher Copyright: © 2019 International Machine Learning Society (IMLS).
Funding
We gratefully acknowledge the Natural Sciences and Engineering Research Council of Canada (NSERC) for supporting O.S. with grant #201706701. O.S. is also an unpaid collaborator on NSF grant #1565387. We thank the anonymous referee for helpful advice on simplifying our original version of the DP-SR algorithm.
| Funders | Funder number |
|---|---|
| NSERC | 201706701 |
| National Science Foundation | 1565387 |
| Natural Sciences and Engineering Research Council of Canada | |