DocumentCode :
3695090
Title :
Machine-readable region identification from partially blurred document images
Author :
Qinwen Wang;Yixue Wang;Chenyang Wang;Jufeng Yang;Tao Li;Kai Wang
Author_Institution :
College of Computer and Control Engineering, Nankai University, Tianjin, China
fYear :
2015
Firstpage :
231
Lastpage :
235
Abstract :
Partial blur sometimes occurs in document images captured by a camera, and it affects OCR performance even on the non-blurred text regions. This paper proposes a real-time method, named MRRI, to identify the machine-readable region in partially blurred document images. First, a reference image is generated by low-pass filtering the given document image. Second, a weight matrix is generated by calculating the structural similarity for each patch. Third, a cost function is minimized to identify the maximum machine-readable region that can be well recognized by OCR. In the experiments, two applications of the identified machine-readable region are considered. On the one hand, Tesseract-OCR is used for word recognition to build an index for a given document image; compared with applying OCR to the whole image, more words are correctly recognized when OCR is applied to the identified region. On the other hand, the identified machine-readable region is used to assess the quality of a document image; compared with two other image quality assessment methods, the machine-readable-region-based method performs better. In addition, MRRI is lightweight and fast, so it can meet the requirements of real-time applications.
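The three steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the filter, the patch size, or the cost function, so everything below is an assumption made for illustration. The sketch uses a box filter as the low-pass filter, takes 1 - SSIM between each patch and its filtered version as the patch weight (a sharp patch changes a lot under blurring, an already blurred patch hardly changes), and replaces the paper's cost function with a simple stand-in: choose the axis-aligned rectangle of patches that maximizes the sum of (weight - tau). The names box_blur, patch_ssim, weight_matrix, best_region and the parameters k, patch and tau are all hypothetical.

```python
import numpy as np

def box_blur(img, k=5):
    """Low-pass filter (assumed): mean over a k x k window, k odd, edge-padded."""
    pad = k // 2
    p = np.pad(img.astype(float), pad, mode="edge")
    # Summed-area table with a leading row/column of zeros.
    c = np.pad(np.cumsum(np.cumsum(p, axis=0), axis=1), ((1, 0), (1, 0)))
    h, w = img.shape
    s = c[k:k + h, k:k + w] - c[:h, k:k + w] - c[k:k + h, :w] + c[:h, :w]
    return s / (k * k)

def patch_ssim(a, b, c1=(0.01 * 255) ** 2, c2=(0.03 * 255) ** 2):
    """Single-window SSIM between two equally sized patches (8-bit range)."""
    mu_a, mu_b = a.mean(), b.mean()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    num = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    den = (mu_a ** 2 + mu_b ** 2 + c1) * (a.var() + b.var() + c2)
    return num / den

def weight_matrix(img, patch=16, k=5):
    """Per-patch sharpness weight, assumed here to be 1 - SSIM(patch, reference)."""
    ref = box_blur(img, k)
    rows, cols = img.shape[0] // patch, img.shape[1] // patch
    w = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            sl = (slice(r * patch, (r + 1) * patch),
                  slice(c * patch, (c + 1) * patch))
            w[r, c] = 1.0 - patch_ssim(img[sl].astype(float), ref[sl])
    return w

def best_region(w, tau=0.5):
    """Stand-in cost: rectangle of patches maximizing sum(w - tau).

    Returns (top, bottom, left, right) as inclusive patch indices,
    found with a 2D Kadane scan over all row ranges.
    """
    score = w - tau
    rows, cols = score.shape
    best_sum, best_rect = -np.inf, None
    for top in range(rows):
        col_sum = np.zeros(cols)
        for bottom in range(top, rows):
            col_sum += score[bottom]
            run, start = 0.0, 0
            for j in range(cols):
                if run <= 0:
                    run, start = col_sum[j], j
                else:
                    run += col_sum[j]
                if run > best_sum:
                    best_sum, best_rect = run, (top, bottom, start, j)
    return best_rect
```

Given a grayscale image as a 2D uint8 array, best_region(weight_matrix(img)) returns patch indices that can be multiplied by the patch size to crop the region before passing it to an OCR engine. The threshold tau trades region size against sharpness and would need tuning on real data.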
Publisher :
IEEE
Conference_Title :
Document Analysis and Recognition (ICDAR), 2015 13th International Conference on
Type :
conf
DOI :
10.1109/ICDAR.2015.7333758
Filename :
7333758