START Conference Manager    

Translating Dialectal Arabic to English

Hassan Sajjad, Kareem Darwish and Yonatan Belinkov

The 51st Annual Meeting of the Association for Computational Linguistics - Short Papers (ACL Short Papers 2013)
Sofia, Bulgaria, August 4-9, 2013


Abstract

We present a dialectal Egyptian Arabic to English statistical machine translation system that leverages dialectal to Modern Standard Arabic (MSA) adaptation. In contrast to previous work, we first narrow down the gap between Egyptian and MSA by applying an automatic character-level transformational model that changes Egyptian to EG', which looks similar to MSA. The transformations include morphological, phonological and spelling changes. The transformation reduces the out-of-vocabulary (OOV) words from 5.2% to 2.6% and gives a gain of 1.87 BLEU points. Further, adapting large MSA/English parallel data increases the lexical coverage, reduces OOVs to 0.7% and leads to an absolute BLEU improvement of 2.73 points. We plan to publicly release the Egyptian/MSA word pairs used for training the conversion model.


START Conference Manager (V2.61.0 - Rev. 2792M)