Monte-Carlo tree search using batch value of perfect information

Shahaf S. Shperberg, Solomon Eyal Shimony, Ariel Felner

Research output: Contribution to conference, Paper, peer-reviewed

4 Scopus citations

Abstract

This paper focuses on the selection phase of Monte-Carlo Tree Search (MCTS). We define batch value of perfect information (BVPI) in game trees as a generalization of value of computation as proposed by Russell and Wefald, and use it for selecting nodes to sample in MCTS. We show that computing the BVPI is NP-hard, but it can be approximated in polynomial time. In addition, we propose methods that intelligently find sets of fringe nodes with high BVPI, and quickly select nodes to sample from these sets. We apply our new BVPI methods to partial game trees, both in a stand-alone set of tests, and as a component of a full MCTS algorithm. Empirical results show that our BVPI methods outperform existing node-selection methods for MCTS in different scenarios.
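To make the idea of a batch value of perfect information concrete, the following is a minimal sketch for a single flat decision, not the paper's game-tree definition or its approximation algorithm. It assumes independent Gaussian beliefs over the value of each option and estimates, by plain Monte Carlo sampling, how much the expected value of the best choice would rise if the true values of a chosen batch of options were revealed. The function name `batch_vpi`, its parameters, and the Gaussian belief model are all illustrative assumptions.

```python
import random


def batch_vpi(means, stds, batch, n_samples=20000, seed=0):
    """Monte Carlo estimate of the value of perfectly observing the options in `batch`.

    Hypothetical flat setting: option i has believed value N(means[i], stds[i]).
    Without new information we would pick the option with the highest mean.
    """
    rng = random.Random(seed)
    best_now = max(means)  # expected value of deciding with current beliefs only
    total = 0.0
    for _ in range(n_samples):
        # Options in the batch are revealed (drawn from their belief);
        # all other options are still judged by their current mean.
        revealed = [
            rng.gauss(m, s) if i in batch else m
            for i, (m, s) in enumerate(zip(means, stds))
        ]
        total += max(revealed)  # decide after seeing the revealed values
    return total / n_samples - best_now


if __name__ == "__main__":
    means, stds = [0.5, 0.4], [0.2, 0.3]
    for batch in (set(), {0}, {1}, {0, 1}):
        print(sorted(batch), round(batch_vpi(means, stds, batch), 4))
```

In this toy setting the estimate is zero for an empty batch and grows as more uncertain options are revealed. In a game tree, the revealed values would have to be propagated through alternating max and min nodes, which is the setting the abstract addresses.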

Original language: English
State: Published - 2017
Externally published: Yes
Event: 33rd Conference on Uncertainty in Artificial Intelligence, UAI 2017 - Sydney, Australia
Duration: 11 Aug 2017 to 15 Aug 2017

Conference

Conference: 33rd Conference on Uncertainty in Artificial Intelligence, UAI 2017
Country/Territory: Australia
City: Sydney
Period: 11/08/17 to 15/08/17

Bibliographical note

Funding Information:
Supported by ISF grant 417/13, and by the Frankel Center. We thank the authors of [Justesen et al., 2014; Eyerich et al., 2010] for providing their code.

Funding

Funders                      Funder number
Frankel Center
Israel Science Foundation    417/13
