
Abstract

Multi-document summarization (MDS) is a challenging task, often decomposed into subtasks of salience and redundancy detection, followed by text generation. In this context, alignment of corresponding sentences between a reference summary and its source documents has been leveraged to generate training data for some of the component tasks. Yet, this enabling alignment step has usually been applied heuristically at the sentence level and to a limited number of subtasks. In this paper, we propose extending the summary-source alignment framework by (1) applying it at the more fine-grained proposition span level, (2) annotating alignment manually in a multi-document setup, and (3) revealing the great potential of summary-source alignments to yield several datasets for at least six different tasks. Specifically, for each of the tasks, we release a manually annotated test set that was derived automatically from the alignment annotation. We also release development and training sets in the same way, but from automatically derived alignments. Using the datasets, each task is demonstrated with baseline models and corresponding evaluation metrics to spur future research on this broad challenge.

Original language: English
Title of host publication: The 62nd Annual Meeting of the Association for Computational Linguistics
Subtitle of host publication: Findings of the Association for Computational Linguistics, ACL 2024
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Publisher: Association for Computational Linguistics (ACL)
Pages: 6527-6548
Number of pages: 22
ISBN (Electronic): 9798891760998
State: Published - 2024
Event: Findings of the 62nd Annual Meeting of the Association for Computational Linguistics, ACL 2024 - Hybrid, Bangkok, Thailand
Duration: 11 Aug 2024 to 16 Aug 2024

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
ISSN (Print): 0736-587X

Conference

Conference: Findings of the 62nd Annual Meeting of the Association for Computational Linguistics, ACL 2024
Country/Territory: Thailand
City: Hybrid, Bangkok
Period: 11/08/24 to 16/08/24

Bibliographical note

Publisher Copyright:
© 2024 Association for Computational Linguistics.

Fingerprint

Dive into the research topics of 'The Power of Summary-Source Alignments'. Together they form a unique fingerprint.
