Improving reliability of word similarity evaluation by redesigning annotation task and performance measure

Oded Avraham, Yoav Goldberg

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

9 Scopus citations

Abstract

We suggest a new method for creating and using gold-standard datasets for word similarity evaluation. Our goal is to improve the reliability of the evaluation, and we do this by redesigning the annotation task to achieve higher inter-rater agreement, and by defining a performance measure which takes the reliability of each annotation decision in the dataset into account.

Original language: English
Title of host publication: Proceedings of the 1st Workshop on Evaluating Vector-Space Representations for NLP, RepEval 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016
Publisher: Association for Computational Linguistics (ACL)
Pages: 106-110
Number of pages: 5
ISBN (Electronic): 9781945626142
State: Published - 2016
Event: 1st Workshop on Evaluating Vector-Space Representations for NLP, RepEval 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Berlin, Germany
Duration: 7 Aug 2016 → …

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
ISSN (Print): 0736-587X

Conference

Conference: 1st Workshop on Evaluating Vector-Space Representations for NLP, RepEval 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016
Country/Territory: Germany
City: Berlin
Period: 7/08/16 → …

Bibliographical note

Funding Information:
The work was supported by the Israeli Science Foundation (grant number 1555/15). We thank Omer Levy for useful discussions.

Publisher Copyright:
© 2016 Proceedings of the Annual Meeting of the Association for Computational Linguistics. All Rights Reserved.
