TY - CPAPER
T1 - Efficient agents for cliff-edge environments with a large set of decision options
AU - Katz, Ron
AU - Kraus, Sarit
PY - 2006
Y1 - 2006
AB - This paper proposes an efficient agent for competing in Cliff Edge (CE) environments, such as sealed-bid auctions, dynamic pricing, and the ultimatum game. The agent competes in one-shot CE interactions repeatedly, each time against a different human opponent, and its performance is evaluated over all the interactions in which it participates. The agent learns the general pattern of the population's behavior without relying on any examples of previous interactions in the environment, neither those of other competitors nor its own. We propose a generic approach that competes in different CE environments under the same configuration, with no knowledge of the specific rules of each environment. The underlying mechanism of the proposed agent is a new meta-algorithm, Deviated Virtual Learning (DVL), which extends existing methods to cope efficiently with environments that offer a large set of decision options at each decision point. Experiments comparing the proposed algorithm with algorithms from the literature, as well as with another intuitive meta-algorithm, show that it significantly outperforms them in both average payoff and stability. In addition, the agent outperformed human competitors executing the same task.
KW - Opponent modeling
KW - Reinforcement learning
KW - Sealed-bid auctions
KW - Ultimatum game
UR - https://www.scopus.com/pages/publications/34247273043
U2 - 10.1145/1160633.1160759
DO - 10.1145/1160633.1160759
M3 - Conference contribution
AN - SCOPUS:34247273043
SN - 1595933034
SN - 9781595933034
T3 - Proceedings of the International Conference on Autonomous Agents
SP - 697
EP - 704
BT - Proceedings of the Fifth International Joint Conference on Autonomous Agents and Multiagent Systems
T2 - Fifth International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS
Y2 - 8 May 2006 through 12 May 2006
ER -