DocumentCode
3418504
Title
A comparative study between methods of Arabic baseline detection
Author
AL-Shatnawi, Atallah ; Omar, Khairuddin
Author_Institution
Dept. of Syst. Sci. & Manage., Univ. Kebangsaan Malaysia, Bangi, Malaysia
Volume
01
fYear
2009
fDate
5-7 Aug. 2009
Firstpage
73
Lastpage
77
Abstract
Preprocessing is the most important stage of an Arabic OCR system; it directly affects the reliability and efficiency of the segmentation and feature extraction stages. Arabic is written cursively, and each of its characters takes between two and four shapes. An Arabic word typically consists of two or more characters connected along an imaginary line called the baseline. Detecting the baseline is one of the main priorities in preprocessing for an Arabic OCR system. The baseline can be used for both skew normalization and character segmentation. This paper lists and clarifies the challenges facing Arabic baseline detection methods and provides a brief comparison of those methods. The comparison is based on the nature of written Arabic, the diacritics (such as dots and zigzags), the word slope, and the subwords found.
Keywords
handwriting recognition; image segmentation; natural language processing; optical character recognition; Arabic OCR system; Arabic baseline detection; Arabic language; Arabic word; character; cursively written; feature extraction stages; imaginary line; skew normalization; Conference management; Feature extraction; Image edge detection; Image segmentation; Informatics; Natural languages; Optical character recognition software; Pattern recognition; Shape; Writing; Arabic; Baseline; Contour; Handwritten; Horizontal Projection; OCR; Offline; Preprocessing; Principal Component Analysis; Skeleton;
fLanguage
English
Publisher
IEEE
Conference_Titel
Electrical Engineering and Informatics, 2009. ICEEI '09. International Conference on
Conference_Location
Selangor
Print_ISBN
978-1-4244-4913-2
Type
conf
DOI
10.1109/ICEEI.2009.5254814
Filename
5254814