Abstract
Question answering models commonly have access to two sources of "knowledge" during inference time: (1) parametric knowledge - the factual knowledge encoded in the model weights, and (2) contextual knowledge - external knowledge (e.g., a Wikipedia passage) given to the model to generate a grounded answer. Having these two sources of knowledge entangled is a core issue for generative QA models, as it is unclear whether a given answer stems from the provided non-parametric knowledge or not. This lack of clarity has implications for trust, interpretability, and factuality. In this work, we propose a new paradigm in which QA models are trained to disentangle the two sources of knowledge. Using counterfactual data augmentation, we introduce a model that predicts two answers for a given question: one based on the given contextual knowledge and one based on parametric knowledge. Our experiments on the Natural Questions dataset show that this approach improves the performance of QA models by making them more robust to knowledge conflicts between the two knowledge sources, while generating useful disentangled answers.
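The abstract's approach (counterfactual augmentation plus a two-answer prediction target) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names (`make_counterfactual`, `format_example`) and the exact input/target string formats are assumptions chosen here for clarity.

```python
def make_counterfactual(context: str, original_answer: str, substitute: str) -> str:
    """Swap the gold answer span in the passage for a different entity,
    yielding a context that conflicts with the model's parametric knowledge."""
    return context.replace(original_answer, substitute)


def format_example(question: str, context: str,
                   contextual_answer: str, parametric_answer: str):
    """Build a seq2seq-style (input, target) pair in which the model is
    trained to emit two answers: one grounded in the passage and one
    drawn from its weights. The template strings are illustrative."""
    source = f"question: {question} context: {context}"
    target = f"contextual: {contextual_answer}; parametric: {parametric_answer}"
    return source, target


# Factual training example: both answers agree.
q = "Where is the Eiffel Tower located?"
ctx = "The Eiffel Tower is a landmark in Paris."
src, tgt = format_example(q, ctx, "Paris", "Paris")

# Counterfactual example: the passage is perturbed, so the grounded
# (contextual) answer diverges from the parametric one, teaching the
# model to keep the two sources separate under knowledge conflict.
cf_ctx = make_counterfactual(ctx, "Paris", "London")
cf_src, cf_tgt = format_example(q, cf_ctx, "London", "Paris")
```

Training on a mix of factual and counterfactual pairs like these is what encourages the model to answer from the passage when it is given one, while still exposing what its weights alone would say.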
Original language | English |
---|---|
Title of host publication | Long Papers |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 10056-10070 |
Number of pages | 15 |
ISBN (Electronic) | 9781959429722 |
State | Published - 2023 |
Externally published | Yes |
Event | 61st Annual Meeting of the Association for Computational Linguistics, ACL 2023 - Toronto, Canada. Duration: 9 Jul 2023 → 14 Jul 2023 |
Publication series
Name | Proceedings of the Annual Meeting of the Association for Computational Linguistics |
---|---|
Volume | 1 |
ISSN (Print) | 0736-587X |
Conference
Conference | 61st Annual Meeting of the Association for Computational Linguistics, ACL 2023 |
---|---|
Country/Territory | Canada |
City | Toronto |
Period | 9/07/23 → 14/07/23 |
Bibliographical note
Publisher Copyright: © 2023 Association for Computational Linguistics.
Funding
This work was carried out as part of a Master Sponsored Research Agreement between the Hebrew University and Google, and was also supported by a gift from Google. We thank Google Cloud for providing us with credits for running experiments on the Google Cloud Platform. This work was also supported in part by the Israel Science Foundation (grant no. 2424/21).
Funders | Funder number |
---|---|
Google Cloud | |
Hebrew University of Jerusalem | |
Israel Science Foundation | 2424/21 |