Diagnosing AI Explanation Methods with Folk Concepts of Behavior

Alon Jacovi, Jasmijn Bastings, Sebastian Gehrmann, Yoav Goldberg, Katja Filippova

Research output: Contribution to journal › Article › peer-review


We investigate a formalism for the conditions of a successful explanation of AI. We consider “success” to depend not only on what information the explanation contains, but also on what information the human explainee understands from it. Theory of mind literature discusses the folk concepts that humans use to understand and generalize behavior. We posit that folk concepts of behavior provide us with a “language” with which humans understand behavior. We use these folk concepts as a framework of social attribution by the human explainee (the information constructs that humans are likely to comprehend from explanations) by introducing a blueprint for an explanatory narrative (Figure 1) that explains AI behavior with these constructs. We then demonstrate, in a qualitative evaluation, that many XAI methods today can be mapped to folk concepts of behavior. This allows us to uncover the failure modes that prevent current methods from explaining successfully, i.e., the information constructs that are missing for any given XAI method, and whose inclusion can decrease the likelihood of misunderstanding AI behavior.

Original language: English
Pages (from-to): 459-489
Number of pages: 31
Journal: Journal of Artificial Intelligence Research
State: Published - 2023

Bibliographical note

Publisher Copyright:
© 2023 The Authors. Published by AI Access Foundation under Creative Commons Attribution License CC BY 4.0.


We are grateful to Tim Miller, Been Kim, Sara Hooker, Hendrik Schuff and Sarah Wiegreffe for valuable discussion and feedback, and to Kremena Goranova and her cats Pippa and Peppi for adorable cat and whiskers pictures in Figures 4 and 5. This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme, grant agreement No. 802774 (iEXTRACT).

Funders (funder number):
Horizon 2020 Framework Programme: 802774
European Research Council

