Abbreviation disambiguation: Experiments with various variants of the one sense per discourse hypothesis

Yaakov HaCohen-Kerner, Ariel Kass, Ariel Peretz

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

7 Scopus citations

Abstract

Abbreviations are very common and are widely used in both written and spoken language. However, they are not always explicitly defined and in many cases they are ambiguous. In this research, we present a process that attempts to solve the problem of abbreviation ambiguity. Various features have been explored, including context-related methods and statistical methods. The application domain is Jewish Law documents written in Hebrew, which are known to be rich in ambiguous abbreviations. Various variants of the one sense per discourse hypothesis (by varying the scope of discourse) have been implemented. Several common machine learning methods have been tested to find a successful integration of these variants. The best results have been achieved by SVM, with 96.09% accuracy.
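The abstract does not say how the variants of the one sense per discourse hypothesis were implemented. The following is only a minimal sketch of the general idea, under stated assumptions: every unresolved abbreviation takes the majority sense of its already-resolved occurrences within a chosen discourse scope. The `Occurrence` record, the scope names (`"paragraph"`, `"document"`, `"corpus"`), and the function names are hypothetical and are not taken from the paper.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Occurrence:
    """One abbreviation occurrence (hypothetical record layout)."""
    doc: int               # index of the document
    para: int              # index of the paragraph within the document
    abbr: str              # surface form of the abbreviation
    sense: Optional[str]   # known expansion, or None if unresolved


def scope_key(occ: Occurrence, scope: str) -> tuple:
    """Group occurrences by the chosen discourse scope (assumed scopes)."""
    if scope == "paragraph":
        return (occ.doc, occ.para, occ.abbr)
    if scope == "document":
        return (occ.doc, occ.abbr)
    if scope == "corpus":
        return (occ.abbr,)
    raise ValueError(f"unknown scope: {scope!r}")


def majority_sense_by_scope(occurrences, scope):
    """Map each scope group to the most frequent known sense inside it."""
    counts = {}
    for occ in occurrences:
        if occ.sense is not None:
            counts.setdefault(scope_key(occ, scope), Counter())[occ.sense] += 1
    return {key: c.most_common(1)[0][0] for key, c in counts.items()}


def disambiguate(occurrences, scope):
    """Return one sense per occurrence; None where the scope has no evidence."""
    majority = majority_sense_by_scope(occurrences, scope)
    return [
        occ.sense if occ.sense is not None
        else majority.get(scope_key(occ, scope))
        for occ in occurrences
    ]
```

Widening the scope from paragraph to document to corpus resolves more occurrences but risks merging genuinely different senses. Under the same assumptions, the guess produced at each scope could serve as one input feature to a classifier such as an SVM; that integration step is not shown here.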

Original language: English
Title of host publication: Natural Language and Information Systems - 13th International Conference on Applications of Natural Language to Information Systems, NLDB 2008, Proceedings
Pages: 27-39
Number of pages: 13
State: Published - 2008
Externally published: Yes
Event: 13th International Conference on Natural Language and Information Systems, NLDB 2008 - London, United Kingdom
Duration: 24 Jun 2008 to 27 Jun 2008

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5039 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 13th International Conference on Natural Language and Information Systems, NLDB 2008
Country/Territory: United Kingdom
City: London
Period: 24/06/08 to 27/06/08
