Improving reliability of word similarity evaluation by redesigning annotation task and performance measure

Oded Avraham, Yoav Goldberg

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

9 Scopus citations

Abstract

We suggest a new method for creating and using gold-standard datasets for word similarity evaluation. Our goal is to improve the reliability of the evaluation, and we do this by redesigning the annotation task to achieve higher inter-rater agreement, and by defining a performance measure which takes the reliability of each annotation decision in the dataset into account.

Original language: English
Title of host publication: Proceedings of the 1st Workshop on Evaluating Vector-Space Representations for NLP, RepEval 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016
Publisher: Association for Computational Linguistics (ACL)
Pages: 106-110
Number of pages: 5
ISBN (Electronic): 9781945626142
State: Published - 2016
Event: 1st Workshop on Evaluating Vector-Space Representations for NLP, RepEval 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Berlin, Germany
Duration: 7 Aug 2016 → …

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
ISSN (Print): 0736-587X

Conference

Conference: 1st Workshop on Evaluating Vector-Space Representations for NLP, RepEval 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016
Country/Territory: Germany
City: Berlin
Period: 7/08/16 → …

Bibliographical note

Publisher Copyright:
© 2016 Proceedings of the Annual Meeting of the Association for Computational Linguistics. All Rights Reserved.

Funding

The work was supported by the Israel Science Foundation (grant number 1555/15). We thank Omer Levy for useful discussions.

Funder: Israel Science Foundation
Funder number: 1555/15
