On interruptible pure exploration in multi-armed bandits

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

2 Scopus citations

Abstract

Interruptible pure exploration in multi-armed bandits (MABs) is a key component of Monte-Carlo tree search algorithms for sequential decision problems. We introduce Discriminative Bucketing (DB), a novel family of strategies for pure exploration in MABs, which allows for adapting recent advances in non-interruptible strategies to the interruptible setting, while guaranteeing exponential-rate performance improvement over time. Our experimental evaluation demonstrates that the corresponding instances of DB favorably compete both with the currently popular strategies UCB1 and ε-Greedy, as well as with the conservative uniform sampling.

Original languageEnglish
Title of host publicationProceedings of the 29th AAAI Conference on Artificial Intelligence, AAAI 2015 and the 27th Innovative Applications of Artificial Intelligence Conference, IAAI 2015
PublisherAI Access Foundation
Pages3592-3598
Number of pages7
ISBN (Electronic)9781577357032
StatePublished - 1 Jun 2015
Externally publishedYes
Event29th AAAI Conference on Artificial Intelligence, AAAI 2015 and the 27th Innovative Applications of Artificial Intelligence Conference, IAAI 2015 - Austin, United States
Duration: 25 Jan 201530 Jan 2015

Publication series

NameProceedings of the National Conference on Artificial Intelligence
Volume5

Conference

Conference29th AAAI Conference on Artificial Intelligence, AAAI 2015 and the 27th Innovative Applications of Artificial Intelligence Conference, IAAI 2015
Country/TerritoryUnited States
CityAustin
Period25/01/1530/01/15

Bibliographical note

Publisher Copyright:
© Copyright 2015, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Fingerprint

Dive into the research topics of 'On interruptible pure exploration in multi-armed bandits'. Together they form a unique fingerprint.

Cite this