Projective simulation with generalization

Alexey A. Melnikov, Adi Makmal, Vedran Dunjko, Hans J. Briegel

Research output: Contribution to journal › Article › peer-review

27 Scopus citations

Abstract

The ability to generalize is an important feature of any intelligent agent, not only because it may allow the agent to cope with large amounts of data, but also because in some environments an agent with no generalization capabilities cannot learn. In this work we outline several criteria for generalization, and present a dynamic and autonomous machinery that enables projective simulation agents to meaningfully generalize. Projective simulation, a novel, physical approach to artificial intelligence, was recently shown to perform well in standard reinforcement learning problems, with applications in advanced robotics as well as quantum experiments. Both the basic projective simulation model and the presented generalization machinery are based on very simple principles. This allows us to provide a full analytical treatment of the agent's performance and to illustrate the benefit the agent gains by generalizing. Specifically, we show that already in basic (but extreme) environments, learning without generalization may be impossible, and demonstrate how the presented generalization machinery enables the projective simulation agent to learn.

Original language: English
Article number: 14430
Journal: Scientific Reports
Volume: 7
Issue number: 1
State: Published - 31 Oct 2017
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2017 The Author(s).

Funding

We wish to thank Markus Tiersch, Dan Browne and Elham Kashefi for helpful discussions. This work was supported in part by the Austrian Science Fund (FWF) through Grant No. SFB FoQuS F4012, and by the Templeton World Charity Foundation (TWCF) through Grant No. TWCF0078/AB46.

Funders (funder number)
Austrian Science Fund (F4012)
Templeton World Charity Foundation
