DocumentCode :
1579368
Title :
Segmentation algorithm for Arabic handwritten text based on contour analysis
Author :
Osman, Yusra
Author_Institution :
Nile Center for Technol. Res., Khartoum, Sudan
fYear :
2013
Firstpage :
447
Lastpage :
452
Abstract :
Segmentation is the process of dividing a binary image into useful regions according to certain conditions. It is the most important phase in any optical character recognition (OCR) system, and its accuracy significantly affects the recognition rate of that system. In cursive languages such as Arabic, segmentation is complicated, especially in handwritten documents, because writers' styles differ and because of special cases such as overlapping characters and ligatures. Hence, the design of segmentation algorithms must be based on general descriptors that most writers follow. In this paper, a segmentation algorithm for Arabic handwriting is developed. The main idea of the algorithm is to divide the selected image into lines and sub-words. Then the contour of each sub-word is traced. After that, the algorithm detects the exact points where the contour changes its state from a horizontal line to a vertical or curved line. Finally, the coordinates of these points are taken as the segmentation points. The algorithm was tested on words from the IFN/ENIT database. Over 537 tested words containing 3222 characters, the algorithm achieved 89.4% correct character segmentation points.
Keywords :
feature extraction; handwritten character recognition; image segmentation; natural language processing; optical character recognition; Arabic handwriting; Arabic handwritten text; Arabic languages; IFN-ENIT database words; OCR system; binary image division; contour analysis; curved line; handwritten documents; optical character recognition; segmentation algorithm; segmentation points; subword contour; vertical line; Algorithm design and analysis; Arrays; Classification algorithms; Image segmentation; Indexes; Optical character recognition software; Writing; Arabic language features; Handwritten text; Segmentation;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computing, Electrical and Electronics Engineering (ICCEEE), 2013 International Conference on
Conference_Location :
Khartoum
Print_ISBN :
978-1-4673-6231-3
Type :
conf
DOI :
10.1109/ICCEEE.2013.6633980
Filename :
6633980