• DocumentCode
    3756116
  • Title
    Processing Judeo-Arabic Texts
  • Author
    Kfir Bar;Nachum Dershowitz;Lior Wolf;Yackov Lubarsky;Yaacov Choueka
  • Author_Institution
    Sch. of Comput. Sci., Tel Aviv Univ., Ramat Aviv, Israel
  • fYear
    2015
  • fDate
    4/1/2015 12:00:00 AM
  • Firstpage
    138
  • Lastpage
    144
  • Abstract
    Judeo-Arabic is a set of dialects spoken and written by Jewish communities living in Arab countries. Judeo-Arabic is typically written in Hebrew letters, enriched with diacritic marks that relate to the underlying Arabic. However, some inconsistencies in rendering words in Hebrew letters increase the level of ambiguity of a given word. Furthermore, Judeo-Arabic texts usually contain non-Arabic words and phrases, such as quotations or borrowed words from Hebrew and Aramaic. We focus on two main tasks: (1) automatic transliteration of Judeo-Arabic Hebrew letters into Arabic letters, and (2) automatic identification of language switching points between Judeo-Arabic and Hebrew. For transliteration, we employ a statistical translation system trained on the character level, resulting in 96.9% precision, a significant improvement over the baseline. For the language switching task, we use a word-level supervised classifier, also showing some significant improvements over the baseline.
  • Keywords
    "Switches","Rendering (computer graphics)","Writing","Noise measurement","Feature extraction","Internet","Morphology"
  • Publisher
    IEEE
  • Conference_Titel
    Arabic Computational Linguistics (ACLing), 2015 First International Conference on
  • Print_ISBN
    978-1-4673-9154-2
  • Type
    conf
  • DOI
    10.1109/ACLing.2015.27
  • Filename
    7422292