Abstract
Modern systems for multi-hop question answering (QA) typically break questions into a sequence of reasoning steps, termed chain-of-thought (CoT), before arriving at a final answer. Often, multiple chains are sampled and aggregated through a voting mechanism over the final answers, but the intermediate steps themselves are discarded. While such approaches improve performance, they do not consider the relations between intermediate steps across chains and do not provide a unified explanation for the predicted answer. We introduce Multi-Chain Reasoning (MCR), an approach which prompts large language models to meta-reason over multiple chains of thought, rather than aggregate their answers. MCR examines different reasoning chains, mixes information between them and selects the most relevant facts in generating an explanation and predicting the answer. MCR outperforms strong baselines on 7 multi-hop QA datasets. Moreover, our analysis reveals that MCR explanations exhibit high quality, enabling humans to verify its answers.
Original language | English |
---|---|
Title of host publication | EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings |
Editors | Houda Bouamor, Juan Pino, Kalika Bali |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 5942-5966 |
Number of pages | 25 |
ISBN (Electronic) | 9798891760608 |
DOIs | |
State | Published - 2023 |
Event | 2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023 - Hybrid, Singapore, Singapore Duration: 6 Dec 2023 → 10 Dec 2023 |
Publication series
Name | EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings |
---|
Conference
Conference | 2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023 |
---|---|
Country/Territory | Singapore |
City | Hybrid, Singapore |
Period | 6/12/23 → 10/12/23 |
Bibliographical note
Publisher Copyright:© 2023 Association for Computational Linguistics.
Funding
We would like to thank Harsh Trivedi, Ofir Press, Mor Geva, Peter Clark and Ashish Sabharwal for their feedback and insightful comments. We thank SerpAPI for their support by granting us an academic discount. This research was partially supported by the Yandex Initiative for Machine Learning and the European Research Council (ERC) under the European Union Horizons 2020 research and innovation programme (grant ERC DELPHI 802800). This work was completed in partial fulfillment of the Ph.D. of Ori Yoran and the Ph.D. of Tomer Wolfson.
Funders | Funder number |
---|---|
European Union Horizons 2020 research and innovation programme | DELPHI 802800 |
Yandex Initiative for Machine Learning | |
European Commission |