Abstract
Producing a reduced version of a source text, as in generic or focused summarization, inherently involves two distinct subtasks: deciding on targeted content and generating a coherent text conveying it. While some popular approaches address summarization as a single end-to-end task, prominent works support decomposed modeling for individual subtasks. Further, semi-automated text reduction is also very appealing, where users may identify targeted content while models would generate a corresponding coherent summary. In this paper, we focus on the second subtask, of generating coherent text given pre-selected content. Concretely, we formalize Controlled Text Reduction as a standalone task, whose input is a source text with marked spans of targeted content ("highlighting"). A model then needs to generate a coherent text that includes all and only the target information. We advocate the potential of such models, both for modular fully-automatic summarization, as well as for semi-automated human-in-the-loop use cases. Facilitating proper research, we crowd-source high-quality dev and test datasets for the task. Further, we automatically generate a larger "silver" training dataset from available summarization benchmarks, leveraging a pre-trained summary-source alignment model. Finally, employing these datasets, we present a supervised baseline model, showing promising results and insightful analyses.
Original language | English |
---|---|
Pages | 5699-5715 |
Number of pages | 17 |
DOIs | |
State | Published - 2022 |
Event | 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 - Abu Dhabi, United Arab Emirates Duration: 7 Dec 2022 → 11 Dec 2022 |
Conference
Conference | 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 |
---|---|
Country/Territory | United Arab Emirates |
City | Abu Dhabi |
Period | 7/12/22 → 11/12/22 |
Bibliographical note
Publisher Copyright:© 2022 Association for Computational Linguistics.
Funding
This work was supported by Intel Labs, the Israel Science Foundation (grant no. 2827/21), and a grant from the Israel Ministry of Science and Technology.
Funders | Funder number |
---|---|
Intel Labs | |
Israel Science Foundation | 2827/21 |
Ministry of science and technology, Israel |