Meta-learning within Projective Simulation

Adi Makmal, Alexey A. Melnikov, Vedran Dunjko, Hans J. Briegel

Research output: Contribution to journal › Article › peer-review

30 Scopus citations

Abstract

Learning models of artificial intelligence can nowadays perform very well on a large variety of tasks. However, in practice, different task environments are best handled by different learning models, rather than a single universal approach. Most non-trivial models thus require the adjustment of several to many learning parameters, which is often done on a case-by-case basis by an external party. Meta-learning refers to the ability of an agent to autonomously and dynamically adjust its own learning parameters or meta-parameters. In this paper, we show how projective simulation, a recently developed model of artificial intelligence, can naturally be extended to account for meta-learning in reinforcement learning settings. The projective simulation approach is based on a random walk process over a network of clips. The suggested meta-learning scheme builds upon the same design and employs clip networks to monitor the agent's performance and to adjust its meta-parameters on the fly. We distinguish between reflex-type adaptation and adaptation through learning, and show the utility of both approaches. In addition, a trade-off between flexibility and learning-time is addressed. The extended model is examined on three different kinds of reinforcement learning tasks, in which the agent has different optimal values of the meta-parameters, and is shown to perform well, reaching near-optimal to optimal success rates in all of them, without ever needing to manually adjust any meta-parameter.
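The abstract describes projective simulation (PS) as a random walk over a network of clips whose edge weights are adjusted by rewards, with meta-learning added by letting the agent tune its own meta-parameters. The following Python sketch is an illustration of that general idea only, assuming the standard two-layer PS clip network from the PS literature; the class names, the damping parameter `gamma`, and the reflex-type adjustment rule in `MetaPSAgent` are hypothetical choices for illustration, not the scheme defined in the paper.

```python
import random

# Minimal sketch of a projective-simulation (PS) agent, assuming a two-layer
# clip network: percept clips connect to action clips via weighted edges
# (h-values). The damping/forgetting rate `gamma` is one of the
# meta-parameters that meta-learning would adjust.

class PSAgent:
    def __init__(self, n_percepts, n_actions, gamma=0.01):
        self.gamma = gamma  # meta-parameter: damping / forgetting rate
        # h-values, initialised to 1 so all actions start equally likely
        self.h = [[1.0] * n_actions for _ in range(n_percepts)]
        self.last = None  # last (percept, action) pair, for crediting reward

    def act(self, percept):
        # Random walk over the clip network: hop from the percept clip to an
        # action clip with probability proportional to the edge's h-value.
        weights = self.h[percept]
        action = random.choices(range(len(weights)), weights=weights)[0]
        self.last = (percept, action)
        return action

    def learn(self, reward):
        # Damping pulls every h-value back toward 1 (forgetting); the edge
        # used in the last step is then strengthened by the received reward.
        for row in self.h:
            for a in range(len(row)):
                row[a] -= self.gamma * (row[a] - 1.0)
        p, a = self.last
        self.h[p][a] += reward


# Hypothetical reflex-type meta-learning wrapper: it monitors a moving
# average of recent rewards and raises gamma (forgets faster) when
# performance drops, illustrating "adjusting meta-parameters on the fly".
class MetaPSAgent(PSAgent):
    def __init__(self, *args, window=50, **kwargs):
        super().__init__(*args, **kwargs)
        self.rewards = []
        self.window = window

    def learn(self, reward):
        super().learn(reward)
        self.rewards.append(reward)
        recent = self.rewards[-self.window:]
        success_rate = sum(recent) / len(recent)
        # Reflex rule: forget faster when recent performance is poor.
        self.gamma = 0.1 if success_rate < 0.5 else 0.01
```

The reflex rule above corresponds to the "reflex-type adaptation" the abstract distinguishes from adaptation through learning; in the paper's extended model the monitoring itself is done with clip networks rather than a hand-coded threshold.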

Original language: English
Article number: 7458793
Pages (from-to): 2110-2122
Number of pages: 13
Journal: IEEE Access
Volume: 4
DOIs
State: Published - 2016
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2016 IEEE.

Funding

This work was supported in part by the Austrian Science Fund through the FoQuS Project Grant F4012 and in part by the Templeton World Charity Foundation under Grant TWCF0078/AB46.

Funders and funder numbers:

• Austrian Science Fund: F4012
• Templeton World Charity Foundation: TWCF0078/AB46

Keywords

• Machine learning
• adaptive algorithm
• learning
• meta-learning
• quantum mechanics
• random processes
• reinforcement learning
