ML-based arm recommendation in short-horizon MABs

Or Zipori, David Sarne

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

1 Scopus citations

Abstract

In many settings where an agent needs to suggest or recommend a course of action to its user, the agent's goal may not fully align with the user's goal. In particular, the agent may maximize its benefit if the user chooses specific alternatives that are not necessarily the ones that maximize her own individual benefit. In this paper we study such a setting in the context of providing advice in two-armed bandit problems. We explore potential strategies for an agent aiming to influence which arm is picked. In particular, we focus on a somewhat naive recommendation strategy that always recommends the agent's preferred arm, and on a strategy that recommends based on various machine learning models that aim to guide the decision regarding when to switch to the agent's less preferred arm. Based on extensive evaluation, we find that both recommendation strategies result in better performance than making no recommendation, and that the naive recommendation strategy performs slightly better than the ML-based recommendations, despite the substantial amount of training data used for the latter.
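The setting in the abstract can be illustrated with a small Monte Carlo sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: Bernoulli rewards, a simulated user who follows a recommendation with a fixed probability `follow_prob` and otherwise acts greedily on her own observations, a horizon of 10 pulls, and the agent's benefit measured as the number of pulls of its preferred arm. The names `simulate_episode`, `naive_recommendation`, and `no_recommendation` are hypothetical. The sketch covers only the always-recommend-the-preferred-arm strategy and the no-recommendation baseline, not the ML-based strategy.

```python
import random


def no_recommendation(pulls, wins, preferred):
    """Baseline: the agent stays silent."""
    return None


def naive_recommendation(pulls, wins, preferred):
    """Naive strategy: always recommend the agent's preferred arm."""
    return preferred


def simulate_episode(recommend, horizon=10, p=(0.5, 0.5),
                     follow_prob=0.7, preferred=0, rng=None):
    """Run one short-horizon two-armed Bernoulli bandit episode.

    Returns how many times the user pulled the agent's preferred arm.
    The user model is a hypothetical stand-in, not the paper's.
    """
    rng = rng or random.Random()
    pulls = [0, 0]   # times each arm was pulled
    wins = [0, 0]    # rewards observed per arm
    preferred_pulls = 0
    for _ in range(horizon):
        rec = recommend(pulls, wins, preferred)
        if rec is not None and rng.random() < follow_prob:
            arm = rec  # user follows the advice
        else:
            # Assumed user: greedy on empirical means, optimistic about
            # untried arms, random tie-breaking.
            means = [wins[a] / pulls[a] if pulls[a] else 1.0 for a in (0, 1)]
            if means[0] == means[1]:
                arm = rng.randrange(2)
            else:
                arm = 0 if means[0] > means[1] else 1
        reward = 1 if rng.random() < p[arm] else 0
        pulls[arm] += 1
        wins[arm] += reward
        preferred_pulls += int(arm == preferred)
    return preferred_pulls


def average_preferred_pulls(recommend, episodes=2000, seed=0, **kwargs):
    """Monte Carlo estimate of the agent's benefit under a strategy."""
    rng = random.Random(seed)
    total = sum(simulate_episode(recommend, rng=rng, **kwargs)
                for _ in range(episodes))
    return total / episodes


if __name__ == "__main__":
    print("no recommendation:", average_preferred_pulls(no_recommendation))
    print("naive recommendation:", average_preferred_pulls(naive_recommendation))
```

Any gap between the two averages reflects only the assumed user model and parameters; it says nothing about the magnitudes reported in the paper.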

Original language: English
Title of host publication: HAI 2021 - Proceedings of the 9th International User Modeling, Adaptation and Personalization Human-Agent Interaction
Publisher: Association for Computing Machinery, Inc
Pages: 377-381
Number of pages: 5
ISBN (Electronic): 9781450386203
State: Published - 9 Nov 2021
Event: 9th International User Modeling, Adaptation and Personalization Human-Agent Interaction, HAI 2021 - Virtual, Online, Japan
Duration: 9 Nov 2021 to 11 Nov 2021

Publication series

Name: HAI 2021 - Proceedings of the 9th International User Modeling, Adaptation and Personalization Human-Agent Interaction

Conference

Conference: 9th International User Modeling, Adaptation and Personalization Human-Agent Interaction, HAI 2021
Country/Territory: Japan
City: Virtual, Online
Period: 9/11/21 to 11/11/21

Bibliographical note

Publisher Copyright:
© 2021 Owner/Author.

Keywords

  • HAI experimental methods
  • Human-virtual agent interaction
  • Machine learning
  • Monte Carlo simulation
  • Multi-armed bandit
  • Recommender agents
