Abstract
Current state-of-the-art coreference systems are based on a single pairwise scoring component, which assigns to each pair of mention spans a score reflecting their tendency to corefer. We observe that different kinds of mention pairs require different information sources to assess their score. We present LINGMESS, a linguistically motivated categorization of mention pairs into six types of coreference decisions, and learn a dedicated trainable scoring function for each category. This significantly improves the accuracy of the pairwise scorer as well as the overall coreference performance on the English OntoNotes coreference corpus and five additional datasets.
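The idea of routing each mention pair to a category-specific scorer can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration. The abstract does not enumerate the six categories, so the five category names, the `categorize` rules, and the toy scoring functions below are hypothetical placeholders, and the fixed functions merely stand in for the trainable scorers the abstract describes.

```python
from typing import Callable, Dict, Tuple

Mention = str  # a mention span, represented here only by its surface text
Scorer = Callable[[Mention, Mention], float]

# Small illustrative pronoun list (an assumption, not the paper's definition).
PRONOUNS = {"he", "she", "it", "they", "him", "her", "them", "his", "its", "their"}


def categorize(a: Mention, b: Mention) -> str:
    """Assign a mention pair to one hypothetical category."""
    a_l, b_l = a.lower(), b.lower()
    a_pron, b_pron = a_l in PRONOUNS, b_l in PRONOUNS
    if a_pron and b_pron:
        return "PRON-PRON"
    if a_pron or b_pron:
        return "ENT-PRON"
    if a_l == b_l:
        return "MATCH"
    if a_l in b_l or b_l in a_l:
        return "CONTAINS"
    return "OTHER"


def token_overlap(a: Mention, b: Mention) -> float:
    """Jaccard overlap of lowercased tokens, used as a toy score."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


# One scorer per category. In a real system each entry would be a separately
# trained function; these fixed rules are stand-ins.
SCORERS: Dict[str, Scorer] = {
    "PRON-PRON": lambda a, b: 1.0 if a.lower() == b.lower() else 0.0,
    "ENT-PRON": lambda a, b: 0.5,
    "MATCH": lambda a, b: 1.0,
    "CONTAINS": token_overlap,
    "OTHER": token_overlap,
}


def score_pair(a: Mention, b: Mention) -> Tuple[str, float]:
    """Pick the scorer for the pair's category and apply it."""
    category = categorize(a, b)
    return category, SCORERS[category](a, b)
```

For example, `score_pair("Obama", "Barack Obama")` is routed to the `CONTAINS` scorer, while `score_pair("Barack Obama", "he")` is routed to `ENT-PRON`. The dispatch table keeps the category decision separate from the scoring logic, so each scorer can draw on different information.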
| Original language | English |
|---|---|
| Title of host publication | EACL 2023 - 17th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference |
| Publisher | Association for Computational Linguistics (ACL) |
| Pages | 2744-2752 |
| Number of pages | 9 |
| ISBN (Electronic) | 9781959429449 |
| DOIs | |
| State | Published - 2023 |
| Event | 17th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2023 - Dubrovnik, Croatia. Duration: 2 May 2023 → 6 May 2023 |
Publication series
| Name | EACL 2023 - 17th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference |
|---|---|
Conference
| Conference | 17th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2023 |
|---|---|
| Country/Territory | Croatia |
| City | Dubrovnik |
| Period | 2/05/23 → 6/05/23 |
Bibliographical note
Publisher Copyright: © 2023 Association for Computational Linguistics.
Funding
This project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme, grant agreement No. 802774 (iEXTRACT). Arie Cattan is partially supported by the PBC fellowship for outstanding PhD candidates in data science.
| Funders | Funder number |
|---|---|
| Horizon 2020 Framework Programme | |
| European Commission | |
| Horizon 2020 | 802774 |
| Planning and Budgeting Committee of the Council for Higher Education of Israel | |