Targeted opponent modeling of memory-bounded agents

Doran Chakraborty, Noa Agmon, Peter Stone

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

5 Scopus citations

Abstract

In a repeated game, a memory-bounded agent selects its next action by basing its policy on a fixed window of the past L plays. Traditional approaches to modeling memory-bounded agents do so based on the past L joint actions. Since the number of possible sequences of L joint actions grows exponentially with L, these approaches are restricted to modeling agents with a small L. This paper explores an alternative, more efficient mechanism for modeling memory-bounded agents based on high-level features derived from the past L plays. Called Targeted Opponent Modeler against Memory-Bounded Agents, or Tommba, our approach successfully models memory-bounded agents in a sample-efficient manner, given a priori knowledge of a feature set that includes the correct features. Tommba is fully implemented, with successful empirical results in a couple of challenging surveillance-based tasks.
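
The counting argument in the abstract can be illustrated with a minimal sketch. It is not the Tommba algorithm: the function names and the example feature (how often the opponent played one particular action within the window) are hypothetical and chosen only to contrast exponential growth in L with the much smaller value space of a single high-level feature.

```python
from itertools import product


def count_joint_histories(n_my_actions: int, n_opp_actions: int, window: int) -> int:
    """Number of distinct sequences of `window` joint actions.

    Each play is one of n_my_actions * n_opp_actions joint actions,
    so the count grows exponentially with the window length L.
    """
    return (n_my_actions * n_opp_actions) ** window


def enumerate_joint_histories(my_actions, opp_actions, window):
    """Explicitly list every joint-action history (feasible only for small L)."""
    joint_actions = list(product(my_actions, opp_actions))
    return list(product(joint_actions, repeat=window))


def count_feature_values(window: int) -> int:
    """Values taken by one hypothetical feature (an assumption, not from the paper):
    the number of times the opponent played a given action in the last
    `window` plays, which ranges over 0..window.
    """
    return window + 1


if __name__ == "__main__":
    # Two actions per player, increasing memory size L.
    for L in (1, 2, 4, 8):
        print(L, count_joint_histories(2, 2, L), count_feature_values(L))
```

With two actions per player, the history count is 4 raised to the power L, while the example feature takes only L + 1 values. That gap is the reason a model conditioned on a few well-chosen features can need far fewer samples than one conditioned on raw joint-action histories.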

Original language: English
Title of host publication: AAMAS 2013 Workshop on Adaptive and Learning Agents, ALA 2013
Publisher: AAMAS
ISBN (Print): 9781943580125
State: Published - 2013
Event: AAMAS 2013 Workshop on Adaptive and Learning Agents, ALA 2013 - Saint Paul, United States
Duration: 6 May 2013 → 7 May 2013

Publication series

Name: AAMAS 2013 Workshop on Adaptive and Learning Agents, ALA 2013

Conference

Conference: AAMAS 2013 Workshop on Adaptive and Learning Agents, ALA 2013
Country/Territory: United States
City: Saint Paul
Period: 6/05/13 → 7/05/13

Keywords

  • Learning
  • Memory-bounded agents
  • Modeling
