Title :
A robust approach for Arabic printed character segmentation
Author :
Najoua, Ben Amara ; Noureddine, Ellouze
Author_Institution :
Ecole Nationale d'Ingenieurs de Monastir, Tunisia
Abstract :
In this paper we present the segmentation module of a complete system for the recognition of printed and handwritten Arabic documents. The system, which is under development, includes several modules, notably for characterising text fonts. It is based on hidden Markov models together with decision trees and a dictionary correction system. After a brief overview of recent work in the field, a new methodology for segmenting off-line Arabic printed characters is presented. This approach is simple and very efficient. The paper describes the choice of the different primitives that can be extracted from the image of the character, and the use of the modulated histogram as well as the number of black segments in a line of pixels. The paper identifies various sources of error and factors that make perfect segmentation difficult. The present algorithm has been tested with most print fonts and is currently being tested for handwritten characters. The results obtained are very promising.
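As an illustration of two of the primitives named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a binarised image given as rows of 0/1 values (1 = black), counts the black segments in a line of pixels, and computes a plain column projection histogram. The abstract does not define the modulated histogram, so the plain projection here is only a stand-in, and the function names and the `max_ink` threshold are hypothetical.

```python
from typing import List

Image = List[List[int]]  # assumption: 1 = black (ink) pixel, 0 = white


def count_black_segments(row: List[int]) -> int:
    """Count maximal runs of consecutive black pixels in one line of pixels."""
    segments = 0
    previous = 0
    for pixel in row:
        # A new segment starts at every white-to-black transition.
        if pixel == 1 and previous == 0:
            segments += 1
        previous = pixel
    return segments


def column_histogram(image: Image) -> List[int]:
    """Number of black pixels in each column (plain vertical projection)."""
    if not image:
        return []
    return [sum(column) for column in zip(*image)]


def candidate_cut_columns(image: Image, max_ink: int = 1) -> List[int]:
    """Columns whose ink count is at most max_ink (hypothetical threshold).

    In connected Arabic print such columns typically hold only the thin
    baseline stroke, so they are plausible places to look for a cut.
    """
    return [x for x, ink in enumerate(column_histogram(image)) if ink <= max_ink]


# Toy image: two vertical strokes joined by a one-pixel baseline.
image = [
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
print(count_black_segments(image[0]))
print(column_histogram(image))
print(candidate_cut_columns(image))
```

A real segmenter would combine such cues with further primitives and font-dependent rules; this sketch only shows how the two quantities can be measured.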
Keywords :
hidden Markov models; image segmentation; optical character recognition; Arabic documents; Arabic printed character segmentation; black segments; decision trees; dictionary correction; handwritten characters; off-line Arabic printed characters; segmentation module; task perfect segmentation; Character recognition; Decision trees; Dictionaries; Hidden Markov models; Histograms; Image segmentation; Pixel; Robustness; Shape; Testing
Conference_Title :
Document Analysis and Recognition, 1995., Proceedings of the Third International Conference on
Conference_Location :
Montreal, Que.
Print_ISBN :
0-8186-7128-9
DOI :
10.1109/ICDAR.1995.602038