Abstract
Producing a reduced version of a source text, as in generic or focused summarization, inherently involves two distinct subtasks: deciding on targeted content and generating a coherent text conveying it. While some popular approaches address summarization as a single end-to-end task, prominent works support decomposed modeling for the individual subtasks. Semi-automated text reduction is also appealing: users identify the targeted content and a model generates a corresponding coherent summary. In this paper, we focus on the second subtask, generating coherent text given pre-selected content. Concretely, we formalize Controlled Text Reduction as a standalone task, whose input is a source text with marked spans of targeted content ("highlighting"). A model then needs to generate a coherent text that includes all and only the target information. We advocate the potential of such models, both for modular fully-automatic summarization and for semi-automated human-in-the-loop use cases. To facilitate research on the task, we crowd-source high-quality dev and test datasets. Further, we automatically generate a larger "silver" training dataset from available summarization benchmarks, leveraging a pre-trained summary-source alignment model. Finally, employing these datasets, we present a supervised baseline model, showing promising results and insightful analyses.
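As a rough illustration of the task input described in the abstract (a source text with marked spans of targeted content), the following minimal sketch wraps character spans in marker strings before the text would be passed to a generation model. The marker strings `<hl>` and `</hl>`, the function name `mark_highlights`, and the character-offset span format are assumptions made for this example only; the abstract does not specify how the highlights are encoded.

```python
from typing import List, Tuple

# Hypothetical marker strings, chosen for illustration only.
HL_START = "<hl>"
HL_END = "</hl>"


def mark_highlights(source: str, spans: List[Tuple[int, int]]) -> str:
    """Wrap each (start, end) character span of `source` in highlight markers.

    Spans are half-open [start, end) offsets and must not overlap.
    """
    pieces = []
    cursor = 0
    for start, end in sorted(spans):
        if start < cursor or end > len(source) or start >= end:
            raise ValueError(f"invalid or overlapping span: ({start}, {end})")
        pieces.append(source[cursor:start])  # unhighlighted text before the span
        pieces.append(HL_START + source[start:end] + HL_END)
        cursor = end
    pieces.append(source[cursor:])  # trailing unhighlighted text
    return "".join(pieces)


if __name__ == "__main__":
    text = "The storm hit Monday. Schools closed."
    print(mark_highlights(text, [(0, 9), (22, 36)]))
```

A model for the task would then be expected to produce a fluent text covering only the marked content; that generation step is outside the scope of this sketch.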
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 |
| Editors | Yoav Goldberg, Zornitsa Kozareva, Yue Zhang |
| Publisher | Association for Computational Linguistics (ACL) |
| Pages | 5699-5715 |
| Number of pages | 17 |
| ISBN (Electronic) | 9781959429401 |
| State | Published - 2022 |
| Event | 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 - Hybrid, Abu Dhabi, United Arab Emirates. Duration: 7 Dec 2022 → 11 Dec 2022 |
Publication series
| Name | Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 |
|---|---|
Conference
| Conference | 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 |
|---|---|
| Country/Territory | United Arab Emirates |
| City | Hybrid, Abu Dhabi |
| Period | 7/12/22 → 11/12/22 |
Bibliographical note
Publisher Copyright: © 2022 Association for Computational Linguistics.