Title :
Page Segmentation of Persian/Arabic Printed Text Using Ink Spread Effect
Author :
Shirali-Shahreza, Sajad ; Manzuri-Shalmani, M.T. ; Shirali-Shahreza, M. Hassan
Author_Institution :
Dept. of Comput. Eng., Sharif Univ. of Technol., Tehran
Abstract :
OCR (optical character recognition) is widely used to convert written documents into digital form. One phase of OCR is page segmentation, in which the text regions of the input image must be located and text parts such as columns must be separated. This paper proposes a new method for segmenting Persian/Arabic printed text. The method is based on the ink spread effect, a new idea with particular features, and its design takes the main characteristics of Persian/Arabic scripts into account. The method is skew resistant and can segment text within frames and tables or in regions with a gray background.
Keywords :
document image processing; optical character recognition; text analysis; Persian/Arabic printed text; document conversion; ink spread effect; page segmentation; Character recognition; Design methodology; Image converters; Image processing; Image segmentation; Ink; Natural languages; Optical character recognition software; Optical computing; Pattern recognition; OCR; Persian/Arabic Document
Conference_Title :
SICE-ICASE, 2006. International Joint Conference
Conference_Location :
Busan
Print_ISBN :
89-950038-4-7
Electronic_ISBN :
89-950038-5-5
DOI :
10.1109/SICE.2006.315618