TY - CONF
T1 - Identification of transliterated foreign words in Hebrew script
AU - Goldberg, Yoav
AU - Elhadad, Michael
PY - 2008
Y1 - 2008
N2 - We present a loosely-supervised method for context-free identification of transliterated foreign names and borrowed words in Hebrew text. The method is purely statistical and does not require the use of any lexicons or linguistic analysis tool for the source languages (Hebrew, in our case). It also does not require any manually annotated data for training - we learn from noisy data acquired by over-generation. We report precision/recall results of 80/82 for a corpus of 4044 unique words, containing 368 foreign words.
AB - We present a loosely-supervised method for context-free identification of transliterated foreign names and borrowed words in Hebrew text. The method is purely statistical and does not require the use of any lexicons or linguistic analysis tool for the source languages (Hebrew, in our case). It also does not require any manually annotated data for training - we learn from noisy data acquired by over-generation. We report precision/recall results of 80/82 for a corpus of 4044 unique words, containing 368 foreign words.
UR - http://www.scopus.com/inward/record.url?scp=49949110947&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-78135-6_40
DO - 10.1007/978-3-540-78135-6_40
M3 - Conference contribution
AN - SCOPUS:49949110947
SN - 354078134X
SN - 9783540781349
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 466
EP - 477
BT - Computational Linguistics and Intelligent Text Processing - 9th International Conference, CICLing 2008, Proceedings
T2 - 9th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2008
Y2 - 17 February 2008 through 23 February 2008
ER -