Abstract
Cross-document event coreference resolution (CDCR) is the task of detecting and clustering mentions of events across a set of documents. A major bottleneck in CDCR is a lack of appropriate datasets, which stems from the difficulty of annotating data for this task. We present the first scalable approach for annotating cross-subtopic event coreference links, a highly valuable but rarely occurring type of cross-document link. Annotating these links requires combing through hundreds of documents, an endeavor for which conventional token-level annotation schemes with trained expert annotators are too expensive. We instead propose crowdsourcing annotation at the sentence level to achieve scalability.
Original language | English |
---|---|
Pages (from-to) | 23-29 |
Number of pages | 7 |
Journal | CEUR Workshop Proceedings |
Volume | 2593 |
State | Published - 2020 |
Event | 3rd Workshop on Narrative Extraction From Texts, Text2Story 2020 - Lisbon, Portugal; Duration: 14 Apr 2020 → … |
Bibliographical note
Publisher Copyright: Copyright © by the paper's authors.
Funding
The authors would like to thank the anonymous reviewers for their helpful insights. This work was supported by the German Research Foundation under grant №GU 798/17-1.
Funders | Funder number |
---|---|
Deutsche Forschungsgemeinschaft | GU 798/17-1 |