Refaie, M., Imam, I., Eissa, I. (2015). Bilingual Language Model for English Arabic Technical Translation. The Egyptian Journal of Language Engineering, 2(2), 22-31. doi: 10.21608/ejle.2015.60193
Bilingual Language Model for English Arabic Technical Translation
1 Faculty of Computer Science, Modern University for Technology and Information
2 Arab Academy for Science, Technology and Maritime Transport
3 Faculty of Computers and Information, Cairo University
Abstract
The rapidly growing volume of new scientific publications increases the need for a reliable and effective automatic machine translation (AMT) system that translates from English, the common language of publications, into other languages. A statistical machine translation (SMT) model crafted for a particular text domain often fails when applied to another domain. This paper characterizes language domains and their behavior in SMT, and experiments with adapting an SMT model to translate scientific text collected from artificial intelligence publications. The effectiveness of the bilingual language model is tested against the typical N-gram language model, and the fill-up and back-off techniques are applied to combine phrase tables from different domains. Just as a human translator of an artificial intelligence book must have strong knowledge of the field, we suggest that for AMT to handle different domains it must be trained on in-domain parallel data, adjusting word weights across domains so that the model learns to differentiate between the different meanings of the same word in different domains.
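To make the N-gram baseline and the back-off idea mentioned above concrete, the sketch below shows a minimal bigram language model with stupid back-off scoring. This is an illustrative assumption on our part, not the paper's implementation: the corpus, the `train`/`score` function names, and the back-off weight `alpha = 0.4` are all hypothetical choices for the example.

```python
from collections import Counter

def train(corpus):
    """Count unigrams and bigrams from a list of tokenized sentences.

    Illustrative sketch only: a real SMT language model (e.g. the paper's
    N-gram baseline) would use a toolkit with smoothing, not raw counts.
    """
    unigrams, bigrams = Counter(), Counter()
    for sent in corpus:
        tokens = ["<s>"] + sent + ["</s>"]  # sentence boundary markers
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def score(w_prev, w, unigrams, bigrams, alpha=0.4):
    """Stupid back-off: use the bigram estimate when the bigram was seen,
    otherwise back off to a discounted unigram estimate (alpha is a
    hypothetical back-off weight, not a value from the paper)."""
    if bigrams[(w_prev, w)] > 0:
        return bigrams[(w_prev, w)] / unigrams[w_prev]
    return alpha * unigrams[w] / sum(unigrams.values())

# Tiny toy corpus: "machine" is followed by "translation" in one sentence
# and "learning" in the other, so both bigrams are seen once each.
corpus = [["machine", "translation", "works"],
          ["machine", "learning", "works"]]
uni, bi = train(corpus)

seen = score("machine", "translation", uni, bi)      # seen bigram
backed_off = score("translation", "learning", uni, bi)  # unseen, backs off
```

The same back-off principle underlies the phrase-table combination the paper tests: prefer the in-domain estimate when it exists, and fall back to the general-domain one otherwise.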