Breaking the subtopic barrier in cross-document event coreference resolution

Michael Bugert, Nils Reimers, Shany Barhom, Ido Dagan, Iryna Gurevych

Research output: Contribution to journal › Conference article › peer-review

10 Scopus citations

Abstract

Cross-document event coreference resolution (CDCR) is the task of detecting and clustering mentions of events across a set of documents. A major bottleneck in CDCR is a lack of appropriate datasets, which stems from the difficulty of annotating data for this task. We present the first scalable approach for annotating cross-subtopic event coreference links, a highly valuable but rarely occurring type of cross-document link. Annotating these links requires combing through hundreds of documents, an endeavor for which conventional token-level annotation schemes with trained expert annotators are too expensive. We instead propose crowdsourcing annotation at the sentence level to achieve scalability.

Original language: English
Pages (from-to): 23-29
Number of pages: 7
Journal: CEUR Workshop Proceedings
Volume: 2593
State: Published - 2020
Event: 3rd Workshop on Narrative Extraction From Texts, Text2Story 2020 - Lisbon, Portugal
Duration: 14 Apr 2020 → …

Bibliographical note

Publisher Copyright:
Copyright © by the paper's authors.

Funding

The authors would like to thank the anonymous reviewers for their helpful insights. This work was supported by the German Research Foundation under grant №GU 798/17-1.

Funder: Deutsche Forschungsgemeinschaft
Funder number: GU 798/17-1
