Feature vector quality and distributional similarity

Maayan Geffet, Ido Dagan

Research output: Contribution to conference › Paper › peer-review

23 Scopus citations

Abstract

We suggest a new goal and evaluation criterion for word similarity measures. The new criterion, meaning-entailing substitutability, fits the needs of semantic-oriented NLP applications and can be evaluated directly (independent of an application) at a good level of human agreement. Motivated by this semantic criterion, we analyze the empirical quality of distributional word feature vectors and its impact on word similarity results, proposing an objective measure for evaluating feature vector quality. Finally, a novel feature weighting and selection function is presented, which yields superior feature vectors and better word similarity performance.

Original language: English
State: Published - 2004
Event: 20th International Conference on Computational Linguistics, COLING 2004 - Geneva, Switzerland
Duration: 23 Aug 2004 – 27 Aug 2004

Conference

Conference: 20th International Conference on Computational Linguistics, COLING 2004
Country/Territory: Switzerland
City: Geneva
Period: 23/08/04 – 27/08/04

Bibliographical note

Publisher Copyright:
© 2004 COLING 2004 - Proceedings of the 20th International Conference on Computational Linguistics. All rights reserved.
