Abstract
Interruptible pure exploration in multi-armed bandits (MABs) is a key component of Monte-Carlo tree search algorithms for sequential decision problems. We introduce Discriminative Bucketing (DB), a novel family of strategies for pure exploration in MABs, which allows for adapting recent advances in non-interruptible strategies to the interruptible setting, while guaranteeing exponential-rate performance improvement over time. Our experimental evaluation demonstrates that the corresponding instances of DB compete favorably both with the currently popular strategies UCB1 and ε-Greedy and with conservative uniform sampling.
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the 29th AAAI Conference on Artificial Intelligence, AAAI 2015 and the 27th Innovative Applications of Artificial Intelligence Conference, IAAI 2015 |
| Publisher | AI Access Foundation |
| Pages | 3592-3598 |
| Number of pages | 7 |
| ISBN (Electronic) | 9781577357032 |
| State | Published - 1 Jun 2015 |
| Externally published | Yes |
| Event | 29th AAAI Conference on Artificial Intelligence, AAAI 2015 and the 27th Innovative Applications of Artificial Intelligence Conference, IAAI 2015 - Austin, United States; Duration: 25 Jan 2015 → 30 Jan 2015 |
Publication series
| Name | Proceedings of the National Conference on Artificial Intelligence |
|---|---|
| Volume | 5 |
Conference
| Conference | 29th AAAI Conference on Artificial Intelligence, AAAI 2015 and the 27th Innovative Applications of Artificial Intelligence Conference, IAAI 2015 |
|---|---|
| Country/Territory | United States |
| City | Austin |
| Period | 25/01/15 → 30/01/15 |
Bibliographical note
Publisher Copyright: © Copyright 2015, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.