Interleaved versus a priori exploration for repeated navigation in a partially-known graph

Shlomo Argamon-Engelson, Sarit Kraus, Sigalit Sina

Research output: Contribution to journal, Article, peer-review

4 Scopus citations

Abstract

In this paper, we address the tradeoff between exploration and exploitation for agents that need to learn more about the structure of their environment in order to perform more effectively. For example, a software agent operating on the World Wide Web may need to learn which sites on the net are most useful, and the most efficient routes to those sites. We compare exploration strategies for a repeated task, where the agent is given some particular task to perform some number of times. Tasks are modeled as navigation on a partially known (deterministic) graph. This paper describes a new utility-based exploration algorithm for repeated tasks which interleaves exploration with task performance. The method takes into account both the costs and the potential benefits (for future task repetitions) of different exploratory actions. Exploration is performed in a greedy fashion, with the locally optimal exploratory action performed during repetition of each task. We experimentally evaluated our utility-based interleaved exploration algorithm against a heuristic search algorithm for exploration before task performance (a priori exploration) as well as a randomized interleaved exploration algorithm. We found that for a single repeated task, utility-based interleaved exploration consistently outperforms the alternatives, unless the number of task repetitions is very high. In addition, we extended the algorithms for the case of multiple repeated tasks, where the agent has a different, randomly-chosen task (from a known subset of possible tasks) to perform each time. Here too, we found that utility-based interleaved exploration is clearly better in most cases.
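The cost-versus-future-benefit tradeoff described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: the `ExploratoryAction` fields, the linear expected-utility formula, and the names `expected_net_utility` and `choose_exploration` are all assumptions made up for illustration. The sketch only shows one plausible way a greedy agent might weigh a one-time detour cost against savings spread over the remaining task repetitions.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ExploratoryAction:
    """Hypothetical summary of one exploratory detour (illustrative only)."""
    name: str
    detour_cost: float     # extra cost paid now to explore
    success_prob: float    # assumed chance the detour reveals a shortcut
    saving_per_run: float  # assumed cost saved on each later repetition


def expected_net_utility(action: ExploratoryAction, remaining_runs: int) -> float:
    """Expected future savings minus the immediate cost of exploring."""
    expected_saving = action.success_prob * action.saving_per_run * remaining_runs
    return expected_saving - action.detour_cost


def choose_exploration(
    actions: Sequence[ExploratoryAction], remaining_runs: int
) -> Optional[ExploratoryAction]:
    """Greedy choice: the single best action, or None if none pays off."""
    best = max(
        actions,
        key=lambda a: expected_net_utility(a, remaining_runs),
        default=None,
    )
    if best is None or expected_net_utility(best, remaining_runs) <= 0:
        return None  # exploit the known route instead
    return best


# Made-up example actions.
actions = [
    ExploratoryAction("side_street", detour_cost=4.0, success_prob=0.5, saving_per_run=2.0),
    ExploratoryAction("long_loop", detour_cost=10.0, success_prob=0.9, saving_per_run=1.0),
]
many_left = choose_exploration(actions, remaining_runs=10)
few_left = choose_exploration(actions, remaining_runs=3)
```

Under these made-up numbers the agent explores `side_street` when ten repetitions remain, and skips exploration when only three remain, because the expected savings no longer cover the detour cost.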

Original language: English
Pages (from-to): 963-986
Number of pages: 24
Journal: International Journal of Pattern Recognition and Artificial Intelligence
Volume: 13
Issue number: 7
State: Published - Nov 1999
Externally published: Yes

Keywords

  • Expected utility
  • Exploration versus exploitation
  • Navigation
  • Random graphs
