Abstract
Various works suggest the appeal of incorporating explicit semantic representations when addressing challenging realistic NLP scenarios. Common approaches offer either comprehensive linguistically based formalisms, like AMR, or alternatively Open-IE, which provides a shallow and partial representation. More recently, an appealing trend introduces semi-structured natural-language structures as an intermediate meaning-capturing representation, often in the form of questions and answers. In this work, we further promote this line of research by considering three prior QA-based semantic representations. These cover verbal, nominalized and discourse-based predications, regarded here as jointly providing a comprehensive representation of textual information, termed QASem. To facilitate this perspective, we investigate how to best utilize pre-trained sequence-to-sequence language models, which seem particularly promising for generating representations that consist of natural language expressions (questions and answers). In particular, we examine and analyze input and output linearization strategies, as well as data augmentation and multitask learning for a setup with scarce training data. Consequently, we release the first unified QASem parsing tool, easily applicable for downstream tasks that can benefit from an explicit semi-structured account of information units in text.
Original language | English |
---|---|
Pages | 7742-7756 |
Number of pages | 15 |
State | Published - 2022 |
Event | 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 - Abu Dhabi, United Arab Emirates |
Duration | 7 Dec 2022 → 11 Dec 2022 |
Conference
Conference | 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 |
---|---|
Country/Territory | United Arab Emirates |
City | Abu Dhabi |
Period | 7/12/22 → 11/12/22 |
Bibliographical note
Publisher Copyright: © 2022 Association for Computational Linguistics.