Title :
Optical character recognition for multilingual documents: Amazigh-French
Author :
El Gajoui, Khadija ; Ataa Allah, Fadoua
Author_Institution :
Lab. of Res. in Inf. & Telecommun., Mohamed-V Univ., Rabat, Morocco
Abstract :
Optical Character Recognition (OCR) is a process that converts scanned or photographed images of typewritten or printed text into editable text. OCR has been studied for many languages. However, few reliable OCR systems are available for the Amazigh language. Furthermore, existing studies focus only on the Tifinagh writing system, an alphabet that has only recently come into general use, following the creation of the Royal Institute of Amazigh Culture in 2001. Hence, it is important to handle Amazigh writing transcribed in the Latin or Arabic alphabets, which were the most widely used in Morocco. In this paper, we focus on Amazigh documents transcribed in the Latin alphabet.
Keywords :
document image processing; natural language processing; optical character recognition; Amazigh documents; Amazigh language; Amazigh-French; Arabic alphabet; Latin alphabet; OCR; Tifinagh writing system; editable text; multilingual documents; Image segmentation; Optical character recognition software; Optical imaging; Sociology; Statistics; Telecommunications; Writing; image; multilingual text;
Conference_Title :
Complex Systems (WCCS), 2014 Second World Conference on
Print_ISBN :
978-1-4799-4648-8
DOI :
10.1109/ICoCS.2014.7061005