OPENASP: A Benchmark for Multi-document Open Aspect-based Summarization

Shmuel Amar, Liat Schiff, Ori Ernst, Asi Shefer, Ori Shapira, Ido Dagan

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

The performance of automatic summarization models has improved dramatically in recent years. Yet, there is still a gap in meeting specific information needs of users in real-world scenarios, particularly when a targeted summary is sought, such as in the useful aspect-based summarization setting targeted in this paper. Previous datasets and studies for this setting have predominantly concentrated on a limited set of pre-defined aspects, focused solely on single document inputs, or relied on synthetic data. To advance research on more realistic scenarios, we introduce OPENASP, a benchmark for multi-document open aspect-based summarization. This benchmark is created using a novel and cost-effective annotation protocol, by which an open aspect dataset is derived from existing generic multi-document summarization datasets. We analyze the properties of OPENASP showcasing its high-quality content. Further, we show that the realistic open-aspect setting realized in OPENASP poses a challenge for current state-of-the-art summarization models, as well as for large language models.

Original languageEnglish
Title of host publicationEMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings
EditorsHouda Bouamor, Juan Pino, Kalika Bali
PublisherAssociation for Computational Linguistics (ACL)
Pages1967-1991
Number of pages25
ISBN (Electronic)9798891760608
StatePublished - 2023
Event2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023 - Hybrid, Singapore, Singapore
Duration: 6 Dec 202310 Dec 2023

Publication series

NameEMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings

Conference

Conference2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023
Country/TerritorySingapore
CityHybrid, Singapore
Period6/12/2310/12/23

Bibliographical note

Publisher Copyright:
©2023 Association for Computational Linguistics.

Funding

This work was supported by the Israel Science Foundation (grant no. 2827/21), the Israel Ministry of Science and Technology, and One AI.

FundersFunder number
Israel Science Foundation2827/21
Ministry of science and technology, Israel

    Fingerprint

    Dive into the research topics of 'OPENASP: A Benchmark for Multi-document Open Aspect-based Summarization'. Together they form a unique fingerprint.

    Cite this