Abstract
Figurative language is ubiquitous in English. Yet, the vast majority of NLP research focuses on literal language. Existing text representations by design rely on compositionality, while figurative language is often non-compositional. In this paper, we study the interpretation of two types of non-compositional figurative language: idioms and similes. We collected datasets of fictional narratives, each containing a figurative expression, along with crowd-sourced plausible and implausible continuations that rely on the correct interpretation of the expression. We then trained models to choose or generate the plausible continuation. Our experiments show that models based solely on pre-trained language models perform substantially worse than humans on these tasks. We additionally propose knowledge-enhanced models that adopt human strategies for interpreting figurative language: inferring meaning from the context and relying on the constituent words’ literal meanings. The knowledge-enhanced models improve performance on both the discriminative and generative tasks, further narrowing the gap to human performance.
Original language | English |
---|---|
Pages (from-to) | 589-606 |
Number of pages | 18 |
Journal | Transactions of the Association for Computational Linguistics |
Volume | 10 |
DOIs | |
State | Published - 16 May 2022 |
Externally published | Yes |
Bibliographical note
Publisher Copyright: © 2022 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.
Funding
This research was supported in part by DARPA under the MCS program through NIWC Pacific (N66001-19-2-4031), Google Cloud computing, and the Allen Institute for AI (AI2).
Funders | Funder number |
---|---|
Google Cloud computing | |
Defense Advanced Research Projects Agency | |
Naval Information Warfare Center Pacific | N66001-19-2-4031 |
Allen Institute for AI | |