Contextual word similarity and estimation from sparse data

Ido Dagan, Shaul Marcus, Shaul Markovitch

Research output: Contribution to journal, Conference article, peer-review

74 Scopus citations

Abstract

In recent years there has been much interest in word cooccurrence relations, such as n-grams, verb-object combinations, or cooccurrence within a limited context. This paper discusses how to estimate the probability of cooccurrences that do not occur in the training data. We present a method that makes local analogies between each specific unobserved cooccurrence and other cooccurrences that contain similar words, as determined by an appropriate word similarity metric. Our evaluation suggests that this method performs better than existing smoothing methods, and may provide an alternative to class-based models.
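The general idea in the abstract, estimating an unseen cooccurrence from cooccurrences of similar words, can be illustrated with a small sketch. This is not the authors' formulation: the cosine similarity over raw counts, the similarity-weighted average, the neighbour count k, the function names, and the toy data are all assumptions chosen for illustration, and the estimate is not renormalized or discounted as a real smoothing method would require.

```python
from collections import Counter
from math import sqrt

# Toy (word, context) pairs; purely illustrative data, not from the paper.
PAIRS = [
    ("eat", "apple"), ("eat", "bread"),
    ("devour", "bread"), ("devour", "apple"), ("devour", "pizza"),
    ("drive", "car"), ("drive", "truck"),
]

def context_vectors(pairs):
    """Map each first word to a Counter of the second words it occurs with."""
    vecs = {}
    for w1, w2 in pairs:
        vecs.setdefault(w1, Counter())[w2] += 1
    return vecs

def cosine(u, v):
    """Cosine similarity of two sparse count vectors (an assumed metric)."""
    dot = sum(u[key] * v[key] for key in u.keys() & v.keys())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def estimate(w1, w2, pairs, k=2):
    """Estimate P(w2 | w1); fall back on similar words when the pair is unseen."""
    vecs = context_vectors(pairs)
    target = vecs.get(w1, Counter())
    if target[w2] > 0:
        # Observed pair: plain relative frequency.
        return target[w2] / sum(target.values())
    # Unseen pair: similarity-weighted average over the k most similar words.
    neighbours = sorted(
        ((cosine(target, vec), word) for word, vec in vecs.items() if word != w1),
        reverse=True,
    )[:k]
    weight = sum(sim for sim, _ in neighbours)
    if weight == 0:
        return 0.0
    weighted = sum(
        sim * vecs[word][w2] / sum(vecs[word].values()) for sim, word in neighbours
    )
    return weighted / weight
```

In this toy data "eat pizza" never occurs, but "devour" shares contexts with "eat" and does occur with "pizza", so estimate("eat", "pizza", PAIRS) is nonzero, while estimate("eat", "car", PAIRS) stays at zero because the only word seen with "car" has no contexts in common with "eat".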

Original language: English
Pages (from-to): 164-171
Number of pages: 8
Journal: Proceedings of the Annual Meeting of the Association for Computational Linguistics
Volume: 1993-June
State: Published - 1993
Externally published: Yes
Event: 31st Annual Meeting of the Association for Computational Linguistics, ACL 1993 - Columbus, United States
Duration: 22 Jun 1993 to 26 Jun 1993

Bibliographical note

Publisher Copyright:
© 1993 Proceedings of the Annual Meeting of the Association for Computational Linguistics. All rights reserved.
