Utility-based on-line exploration for repeated navigation in an embedded graph

Shlomo Argamon-Engelson, Sarit Kraus, Sigalit Sina

Research output: Contribution to journal, Article, peer-review

7 Scopus citations

Abstract

In this paper, we address the tradeoff between exploration and exploitation for agents which need to learn more about the structure of their environment in order to perform more effectively. For example, a robot may need to learn the most efficient routes between important sites in its environment. We compare on-line and off-line exploration for a repeated task, where the agent is given some particular task to perform some number of times. Tasks are modeled as navigation on a graph embedded in the plane. This paper describes a utility-based on-line exploration algorithm for repeated tasks, which takes into account both the costs and potential benefits (over future task repetitions) of different exploratory actions. Exploration is performed in a greedy fashion, with the locally optimal exploratory action performed on each task repetition. We experimentally evaluated our utility-based on-line algorithm against a heuristic search algorithm for off-line exploration as well as a randomized on-line exploration algorithm. We found that for a single repeated task, utility-based on-line exploration consistently outperforms the alternatives, unless the number of task repetitions is very high. In addition, we extended the algorithms for the case of multiple repeated tasks, where the agent has a different randomly-chosen task to perform each time. Here too, we found that utility-based on-line exploration is often preferred.
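The greedy, utility-based choice described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the linear utility model and every name in it (`ExploratoryAction`, `extra_cost`, `success_prob`, `saving_per_run`, `choose_action`) are assumptions made here for illustration. It only shows the general idea of weighing the immediate cost of an exploratory action against its expected benefit over the remaining task repetitions.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ExploratoryAction:
    """Hypothetical description of one exploratory detour."""
    name: str
    extra_cost: float      # added travel cost on the current repetition
    success_prob: float    # estimated chance the detour reveals a shorter route
    saving_per_run: float  # estimated cost saved on each later repetition


def expected_net_utility(action: ExploratoryAction, remaining_runs: int) -> float:
    # Assumed model: expected future savings minus the immediate cost.
    return action.success_prob * action.saving_per_run * remaining_runs - action.extra_cost


def choose_action(
    actions: Sequence[ExploratoryAction], remaining_runs: int
) -> Optional[ExploratoryAction]:
    """Greedily pick the best exploratory action, or None to exploit the known route."""
    best = max(
        actions,
        key=lambda a: expected_net_utility(a, remaining_runs),
        default=None,
    )
    if best is None or expected_net_utility(best, remaining_runs) <= 0:
        return None
    return best


# Hypothetical candidates, for demonstration only.
candidates = [
    ExploratoryAction("long detour", extra_cost=4.0, success_prob=0.5, saving_per_run=2.0),
    ExploratoryAction("short detour", extra_cost=1.0, success_prob=0.2, saving_per_run=1.0),
]
many_left = choose_action(candidates, remaining_runs=10)
one_left = choose_action(candidates, remaining_runs=1)
```

Under this assumed model, exploration pays off only while enough repetitions remain to recoup its cost; with one repetition left, the sketch falls back to the known route.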

Original language: English
Pages (from-to): 267-284
Number of pages: 18
Journal: Artificial Intelligence
Volume: 101
Issue number: 1-2
State: Published - May 1998

Bibliographical note

Funding Information:
This research was supported in part by the National Science Foundation grant number IRI-9724937. Part of this work was also supported by a fellowship from the Fulbright Foundation.

Funding

Funders (funder number)
Fulbright Foundation
National Science Foundation: IRI-9724937
Directorate for Computer and Information Science and Engineering: 9724937

    Keywords

    • Exploration versus exploitation
    • Navigation on embedded graphs
    • Repeated tasks
    • Utility-based search
