Realistic Evaluation Principles for Cross-document Coreference Resolution

Arie Cattan, Alon Eirew, Gabriel Stanovsky, Mandar Joshi, Ido Dagan

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

6 Scopus citations

Abstract

We point out that common evaluation practices for cross-document coreference resolution have been unrealistically permissive in their assumed settings, yielding inflated results. We propose addressing this issue via two evaluation methodology principles. First, as in other tasks, models should be evaluated on predicted mentions rather than on gold mentions. Doing this raises a subtle issue regarding singleton coreference clusters, which we address by decoupling the evaluation of mention detection from that of coreference linking. Second, we argue that models should not exploit the synthetic topic structure of the standard ECB+ dataset, forcing models to confront the lexical ambiguity challenge, as intended by the dataset creators. We demonstrate empirically the drastic impact of our more realistic evaluation principles on a competitive model, yielding a score which is 33 F1 lower compared to evaluating by prior lenient practices.

Original language: English
Title of host publication: *SEM 2021 - 10th Conference on Lexical and Computational Semantics, Proceedings of the Conference
Editors: Lun-Wei Ku, Vivi Nastase, Ivan Vulic
Publisher: Association for Computational Linguistics (ACL)
Pages: 143-151
Number of pages: 9
ISBN (Electronic): 9781954085770
State: Published - 2021
Event: 10th Conference on Lexical and Computational Semantics, *SEM 2021 - Virtual, Bangkok, Thailand
Duration: 5 Aug 2021 – 6 Aug 2021

Publication series

Name: *SEM 2021 - 10th Conference on Lexical and Computational Semantics, Proceedings of the Conference

Conference

Conference: 10th Conference on Lexical and Computational Semantics, *SEM 2021
Country/Territory: Thailand
City: Virtual, Bangkok
Period: 5/08/21 – 6/08/21

Bibliographical note

Publisher Copyright:
© 2021 Lexical and Computational Semantics
