TY - GEN
T1 - Abbreviation disambiguation
T2 - 13th International Conference on Applications of Natural Language to Information Systems, NLDB 2008
AU - HaCohen-Kerner, Yaakov
AU - Kass, Ariel
AU - Peretz, Ariel
PY - 2008
Y1 - 2008
AB - Abbreviations are very common and are widely used in both written and spoken language. However, they are not always explicitly defined and in many cases they are ambiguous. In this research, we present a process that attempts to solve the problem of abbreviation ambiguity. Various features have been explored, including context-related methods and statistical methods. The application domain is Jewish Law documents written in Hebrew, which are known to be rich in ambiguous abbreviations. Various variants of the one sense per discourse hypothesis (by varying the scope of discourse) have been implemented. Several common machine learning methods have been tested to find a successful integration of these variants. The best results have been achieved by SVM, with 96.09% accuracy.
UR - http://www.scopus.com/inward/record.url?scp=47749110638&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-69858-6_5
DO - 10.1007/978-3-540-69858-6_5
M3 - Conference contribution
AN - SCOPUS:47749110638
SN - 3540698574
SN - 9783540698579
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 27
EP - 39
BT - Natural Language and Information Systems - 13th International Conference on Applications of Natural Language to Information Systems, NLDB 2008, Proceedings
Y2 - 24 June 2008 through 27 June 2008
ER -