Abstract
As multi-agent reinforcement learning (MARL) systems are increasingly deployed throughout society, it is imperative yet challenging for users to understand the emergent behaviors of MARL agents in complex environments. This work presents an approach for generating policy-level contrastive explanations for MARL to answer a temporal user query, which specifies a sequence of tasks completed by agents with possible cooperation. The proposed approach encodes the temporal query as a PCTL* logic formula and checks if the query is feasible under a given MARL policy via probabilistic model checking. Such explanations can help reconcile discrepancies between the actual and anticipated multi-agent behaviors. The proposed approach also generates correct and complete explanations to pinpoint reasons that make a user query infeasible. We have successfully applied the proposed approach to four benchmark MARL domains (up to 9 agents in one domain). Moreover, the results of a user study show that the generated explanations significantly improve user performance and satisfaction.
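To illustrate the kind of feasibility check the abstract describes, here is a minimal sketch in Python. It is not the authors' implementation and does not use a probabilistic model checker. It assumes the MARL policy has already been abstracted into a small finite Markov chain, and that the temporal query is an ordered list of tasks, read as a nested "eventually" formula such as P>0 [ F (task1 ∧ F task2) ]. The function name `query_is_feasible`, the state names, and the task labels are all hypothetical. In a finite chain, a nonzero completion probability is equivalent to graph reachability in the product of the chain with the query progress index, so a breadth-first search suffices.

```python
from collections import deque


def query_is_feasible(transitions, labels, initial, tasks):
    """Return True if the ordered task list can be completed with nonzero probability.

    transitions: dict mapping state -> list of (next_state, probability)
    labels:      dict mapping state -> set of tasks completed in that state
    initial:     initial state of the policy-induced Markov chain (assumed given)
    tasks:       ordered list of task names forming the temporal query
    """

    def advance(state, idx):
        # Move the progress index past every query task that this state completes.
        while idx < len(tasks) and tasks[idx] in labels.get(state, ()):
            idx += 1
        return idx

    start = (initial, advance(initial, 0))
    seen = {start}
    queue = deque([start])
    while queue:
        state, idx = queue.popleft()
        if idx == len(tasks):
            return True  # every task was completed in the requested order
        for nxt, prob in transitions.get(state, []):
            if prob <= 0:
                continue  # zero-probability edges cannot contribute
            node = (nxt, advance(nxt, idx))
            if node not in seen:
                seen.add(node)
                queue.append(node)
    return False


# Hypothetical four-state chain: "fire" is completed in s1, "victim" in s3.
transitions = {
    "s0": [("s1", 0.7), ("s2", 0.3)],
    "s1": [("s3", 1.0)],
    "s2": [("s2", 1.0)],
    "s3": [("s3", 1.0)],
}
labels = {"s1": {"fire"}, "s3": {"victim"}}

print(query_is_feasible(transitions, labels, "s0", ["fire", "victim"]))  # True
print(query_is_feasible(transitions, labels, "s0", ["victim", "fire"]))  # False
```

The second query is infeasible because no path in this chain completes "fire" after "victim". A contrastive explanation would need to identify that ordering as the reason, which this sketch does not attempt.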
Original language | English |
---|---|
Title of host publication | Proceedings of the 32nd International Joint Conference on Artificial Intelligence, IJCAI 2023 |
Editors | Edith Elkind |
Publisher | International Joint Conferences on Artificial Intelligence |
Pages | 55-63 |
Number of pages | 9 |
ISBN (Electronic) | 9781956792034 |
DOIs | |
State | Published - 2023 |
Event | 32nd International Joint Conference on Artificial Intelligence, IJCAI 2023 - Macao, China; Duration: 19 Aug 2023 → 25 Aug 2023 |
Publication series
Name | IJCAI International Joint Conference on Artificial Intelligence |
---|---|
Volume | 2023-August |
ISSN (Print) | 1045-0823 |
Conference
Conference | 32nd International Joint Conference on Artificial Intelligence, IJCAI 2023 |
---|---|
Country/Territory | China |
City | Macao |
Period | 19/08/23 → 25/08/23 |
Bibliographical note
Publisher Copyright: © 2023 International Joint Conferences on Artificial Intelligence. All rights reserved.
Funding
This work was supported in part by the U.S. National Science Foundation under grant CCF-1942836, the U.S. Office of Naval Research under grant N00014-18-1-2829, the U.S. Air Force Office of Scientific Research under grant FA9550-21-1-0164, the Israel Science Foundation under grant 1958/20, and the EU Project TAILOR under grant 952215. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the grant sponsors.
Funders | Funder number |
---|---|
National Science Foundation | CCF-1942836 |
Office of Naval Research | N00014-18-1-2829 |
Air Force Office of Scientific Research | FA9550-21-1-0164 |
EU Project TAILOR | 952215 |
Israel Science Foundation | 1958/20 |