Processing Judeo-Arabic texts

Kfir Bar, Nachum Dershowitz, Lior Wolf, Yackov Lubarsky, Yaacov Choueka

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

Judeo-Arabic is a set of dialects spoken and written by Jewish communities living in Arab countries. Judeo-Arabic is typically written in Hebrew letters, enriched with diacritic marks that relate to the underlying Arabic. However, some inconsistencies in rendering words in Hebrew letters increase the level of ambiguity of a given word. Furthermore, Judeo-Arabic texts usually contain non-Arabic words and phrases, such as quotations or borrowed words from Hebrew and Aramaic. We focus on two main tasks: (1) automatic transliteration of Judeo-Arabic Hebrew letters into Arabic letters, and (2) automatic identification of language switching points between Judeo-Arabic and Hebrew. For transliteration, we employ a statistical translation system trained on the character level, resulting in 96.9% precision, a significant improvement over the baseline. For the language switching task, we use a word-level supervised classifier, also showing some significant improvements over the baseline.
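As a rough illustration of the transliteration task, the sketch below maps Hebrew letters to Arabic letters one for one. It is not the character-level statistical translation system described in the abstract, and the abstract does not say what the baseline was. The table `HEBREW_TO_ARABIC` and the function `transliterate` are hypothetical names, and the mapping is a simplified assumption that ignores diacritic marks. Because each Hebrew letter is forced to a single Arabic letter, it cannot resolve the ambiguity the abstract mentions, which is the motivation for a context-aware model.

```python
# Simplified, assumed letter-for-letter map (NOT the paper's model or baseline).
# Final forms (ך ם ן ף ץ) are mapped like their regular counterparts.
HEBREW_TO_ARABIC = {
    "א": "ا", "ב": "ب", "ג": "ج", "ד": "د", "ה": "ه", "ו": "و",
    "ז": "ز", "ח": "ح", "ט": "ط", "י": "ي", "כ": "ك", "ך": "ك",
    "ל": "ل", "מ": "م", "ם": "م", "נ": "ن", "ן": "ن", "ס": "س",
    "ע": "ع", "פ": "ف", "ף": "ف", "צ": "ص", "ץ": "ص", "ק": "ق",
    "ר": "ر", "ש": "ش", "ת": "ت",
}

# str.maketrans accepts a dict whose keys are single characters.
_TABLE = str.maketrans(HEBREW_TO_ARABIC)


def transliterate(text: str) -> str:
    """Replace each mapped Hebrew letter; leave every other character as is."""
    return text.translate(_TABLE)


if __name__ == "__main__":
    # Judeo-Arabic spelling of the Arabic word for "book".
    print(transliterate("כתאב"))  # prints كتاب
```

A fixed table like this always renders ג as ج and never as غ, for example, so any word whose Hebrew spelling admits more than one Arabic reading is a potential error for it.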

Original language: English
Title of host publication: Proceedings - 1st International Conference on Arabic Computational Linguistics
Subtitle of host publication: Advances in Arabic Computational Linguistics, ACLing 2015
Editors: Alexander Gelbukh, Khaled Shaalan
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 138-144
Number of pages: 7
ISBN (Electronic): 9781467391559
State: Published - 29 Feb 2016
Externally published: Yes
Event: 1st International Conference on Arabic Computational Linguistics, ACLing 2015 - Cairo, Egypt
Duration: 17 Apr 2015 to 20 Apr 2015

Publication series

Name: Proceedings - 1st International Conference on Arabic Computational Linguistics: Advances in Arabic Computational Linguistics, ACLing 2015

Conference

Conference: 1st International Conference on Arabic Computational Linguistics, ACLing 2015
Country/Territory: Egypt
City: Cairo
Period: 17/04/15 to 20/04/15

Bibliographical note

Publisher Copyright:
© 2015 IEEE.

Keywords

  • Code switching
  • Judeo-Arabic
  • Transliteration
