Abstract
Cross-document event coreference resolution (CDCR) is the task of detecting and clustering mentions of events across a set of documents. A major bottleneck in CDCR is a lack of appropriate datasets, which stems from the difficulty of annotating data for this task. We present the first scalable approach for annotating cross-subtopic event coreference links, a highly valuable but rarely occurring type of cross-document link. The annotation of these links requires combing through hundreds of documents, an endeavor for which conventional token-level annotation schemes with trained expert annotators are too expensive. We instead propose crowdsourcing annotation at the sentence level to achieve scalability.
| Original language | English |
|---|---|
| Pages (from-to) | 23-29 |
| Number of pages | 7 |
| Journal | CEUR Workshop Proceedings |
| Volume | 2593 |
| State | Published - 2020 |
| Event | 3rd Workshop on Narrative Extraction From Texts, Text2Story 2020 - Lisbon, Portugal; Duration: 14 Apr 2020 → … |
Bibliographical note
Publisher Copyright: Copyright © by the paper's authors.
Funding
The authors would like to thank the anonymous reviewers for their helpful insights. This work was supported by the German Research Foundation under grant №GU 798/17-1.
| Funders | Funder number |
|---|---|
| Deutsche Forschungsgemeinschaft | GU 798/17-1 |