DocumentCode
2199441
Title
Impact of Character Models Choice on Arabic Text Recognition Performance
Author
Slimane, Fouad ; Ingold, Rolf ; Kanoun, Slim ; Alimi, Adel M. ; Hennebert, Jean
Author_Institution
Dept. of Inf., Univ. of Fribourg, Fribourg, Switzerland
fYear
2010
fDate
16-18 Nov. 2010
Firstpage
670
Lastpage
675
Abstract
This paper analyzes the impact of sub-model choice on automatic recognition of printed Arabic text based on Hidden Markov Models (HMMs). In our approach, sub-models correspond to character shapes, which are assembled to compose word models. A peculiarity of Arabic writing is that characters take different shapes depending on their position in the word: with 28 basic characters, there are over 120 different shapes. Ideally, there would be one sub-model for each shape. However, some shapes are less frequent than others and, because training databases are finite, the learning process yields less reliable models for the infrequent shapes. We show that an optimal set of models must therefore be found by trading off having more models, which capture the intricacies of individual shapes, against grouping the models of similar shapes together. We propose different sets of sub-models, which have been evaluated on the Arabic Printed Text Image (APTI) database, freely available to the scientific community.
Keywords
character recognition; hidden Markov models; image recognition; text analysis; Arabic printed text image database; Arabic printed text recognition; character models choice; hidden Markov model; learning process
fLanguage
English
Publisher
IEEE
Conference_Titel
Frontiers in Handwriting Recognition (ICFHR), 2010 International Conference on
Conference_Location
Kolkata
Print_ISBN
978-1-4244-8353-2
Type
conf
DOI
10.1109/ICFHR.2010.110
Filename
5693641