DocumentCode :
3142690
Title :
Automatic separation of machine-printed and hand-written text lines
Author :
Pal, U. ; Chaudhuri, B.B.
Author_Institution :
Comput. Vision & Pattern Recognition Unit, Indian Stat. Inst., Calcutta, India
fYear :
1999
fDate :
20-22 Sep 1999
Firstpage :
645
Lastpage :
648
Abstract :
There are many types of documents where machine-printed and hand-written texts appear intermixed. Since the optical character recognition (OCR) methodologies for machine-printed and hand-written texts are different, it is necessary to separate these two types of text before feeding them to the respective OCR systems. In this paper, we present such a scheme for both Bangla and Devnagari characters. The scheme is based on the structural and statistical features of the machine-printed and hand-written text lines. The classification scheme has an accuracy of about 98.3%.
Keywords :
character sets; document image processing; image classification; image segmentation; optical character recognition; Bangla characters; Devnagari characters; OCR systems; accuracy; automatic text line separation; classification scheme; handwritten text lines; machine-printed text lines; optical character recognition; statistical features; structural features; Computer vision; Data mining; Handwriting recognition; Histograms; Image segmentation; Natural languages; Neural networks; Optical character recognition software; Pattern recognition; Statistics
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Document Analysis and Recognition, 1999. ICDAR '99. Proceedings of the Fifth International Conference on
Conference_Location :
Bangalore
Print_ISBN :
0-7695-0318-7
Type :
conf
DOI :
10.1109/ICDAR.1999.791870
Filename :
791870