DocumentCode
2197080
Title
A Novel Arabic Baseline Estimation Algorithm Based on Sub-Words Treatment
Author
Boukerma, Hanene ; Farah, Nadir
Author_Institution
Lab. de Gestion Electron. du Document (LABGED), Univ. 20 Aout 1955, Skikda, Algeria
fYear
2010
fDate
16-18 Nov. 2010
Firstpage
335
Lastpage
338
Abstract
Baseline detection is an essential preprocessing step for many OCR systems. It directly affects the efficiency and reliability of the character segmentation and feature extraction stages, which contribute strongly to recognition accuracy. For Arabic handwriting, conventional methods that extract the baseline as a single straight line are ill-suited, because an Arabic word may be composed of two or more sub-words (PAWs), and the arrangement of these sub-words can produce different slant angles within the same word. Focusing on the source of the problem, we propose a novel Arabic baseline estimation algorithm in which the PAW, rather than the word, is the basic unit to be processed. Experimental results on the IFN/ENIT database [1] demonstrate the efficiency of the proposed algorithm.
Keywords
edge detection; feature extraction; handwritten character recognition; image segmentation; natural languages; optical character recognition; word processing; Arabic handwritten character recognition; OCR system; PAW level; Arabic baseline estimation algorithm; baseline detection; character segmentation reliability; subword treatment; Arabic handwritten; preprocessing; sub-word extraction
fLanguage
English
Publisher
IEEE
Conference_Titel
Frontiers in Handwriting Recognition (ICFHR), 2010 International Conference on
Conference_Location
Kolkata
Print_ISBN
978-1-4244-8353-2
Type
conf
DOI
10.1109/ICFHR.2010.58
Filename
5693545