Abstract
We compare four similarity-based estimation methods against back-off and maximum-likelihood estimation methods on a pseudo-word sense disambiguation task in which we controlled for both unigram and bigram frequency. The similarity-based methods perform up to 40% better on this particular task. We also conclude that events that occur only once in the training set have a major impact on similarity-based estimates.
Original language | English |
---|---|
Pages (from-to) | 56-63 |
Number of pages | 8 |
Journal | Proceedings of the Annual Meeting of the Association for Computational Linguistics |
Volume | 1997-July |
State | Published - 1997 |
Event | 35th Annual Meeting of the Association for Computational Linguistics, ACL 1997 and 8th Conference of the European Chapter of the Association for Computational Linguistics, EACL 1997 - Madrid, Spain. Duration: 7 Jul 1997 → 12 Jul 1997 |
Bibliographical note
Publisher Copyright: © 1997 Proceedings of the Annual Meeting of the Association for Computational Linguistics. All Rights Reserved.