Title :
Shape-DNA: Effective Character Restoration and Enhancement for Arabic Text Documents
Author :
Caner, G. ; Haritaoglu, I.
Author_Institution :
Polar Rain, Inc., San Jose, CA, USA
Abstract :
We present a novel learning-based image restoration and enhancement technique that improves the character recognition performance of OCR products on degraded documents and on documents or text captured with mobile devices such as camera phones. The proposed technique is language independent and can simultaneously increase the effective resolution and restore broken characters whose artifacts are due to the image capturing device, such as a low-quality, low-resolution camera, or to earlier processing steps, such as extracting the text region from the document image. It learns a predictive relationship between high-resolution training images and their low-resolution/degraded counterparts, and exploits this relationship in a probabilistic scheme to generate a high-resolution image from a low-quality, low-resolution text image. We also present a fast and scalable implementation of the proposed character restoration algorithm to improve text recognition for document and text images captured by mobile phones. Experimental results demonstrate that the system increases OCR performance for documents captured by mobile imaging devices from about 50% to over 80% for non-Latin document and scene text images at 120 dpi.
Keywords :
feature extraction; image enhancement; image resolution; image restoration; learning (artificial intelligence); optical character recognition; text analysis; OCR products; Arabic text document enhancement; camera-phones; character recognition performance; character restoration; document image; image capturing device; learning-based image restoration; mobile devices; text region preprocessing; Computational modeling; Degradation; Optical character recognition software; Shape; Training; document image processing
Conference_Title :
Pattern Recognition (ICPR), 2010 20th International Conference on
Conference_Location :
Istanbul
Print_ISBN :
978-1-4244-7542-1
DOI :
10.1109/ICPR.2010.506