DocumentCode
2158577
Title
OCR Oriented Binarization Method of Document Image
Author
Yang, You
Volume
4
fYear
2008
fDate
27-30 May 2008
Firstpage
622
Lastpage
625
Abstract
For gray-level images of government resource documents, a linear transform was employed to enhance image contrast, and a spatial filter was applied to eliminate image noise. After this preprocessing, the threshold surface T1 was computed by the Bernsen algorithm, and the global threshold T2 was calculated by a modified Otsu method. On the basis of T1 and T2, three further thresholds were defined: the broken-stroke value T3, the neighborhood average value T4, and the union value T5 combining the global and local thresholds. The gray-level image was then binarized using a combination of these five values. Our experiments showed that the proposed method using these five thresholds adapted to various government documents. By eliminating ghost artifacts and mending broken strokes, it benefits OCR.
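The first steps named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes a min-max linear contrast stretch, the textbook Bernsen midpoint surface for T1, and the standard (unmodified) Otsu method for T2. The abstract does not define T3, T4, T5, the spatial filter, or the rule for combining the thresholds, so they are omitted. The function names and the `radius` parameter are hypothetical.

```python
import numpy as np


def stretch_contrast(img):
    """Assumed linear transform: map the image's [min, max] range onto [0, 255]."""
    img = np.asarray(img, dtype=np.float64)
    lo, hi = img.min(), img.max()
    if hi == lo:
        # A flat image has no contrast to stretch.
        return np.zeros_like(img)
    return (img - lo) * 255.0 / (hi - lo)


def bernsen_surface(img, radius=1):
    """Textbook Bernsen surface T1: midpoint of the local min and max
    inside a (2*radius + 1) x (2*radius + 1) window around each pixel."""
    img = np.asarray(img, dtype=np.float64)
    padded = np.pad(img, radius, mode="edge")
    h, w = img.shape
    size = 2 * radius + 1
    # Stack every shifted view of the window, then reduce across the stack.
    windows = np.stack([
        padded[dy:dy + h, dx:dx + w]
        for dy in range(size)
        for dx in range(size)
    ])
    return (windows.max(axis=0) + windows.min(axis=0)) / 2.0


def otsu_threshold(img):
    """Standard Otsu threshold T2 (not the paper's modified variant):
    the gray level that maximizes the between-class variance."""
    levels = np.clip(np.rint(img), 0, 255).astype(np.uint8)
    hist = np.bincount(levels.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                  # class-0 probability per level
    mu = np.cumsum(prob * np.arange(256))    # class-0 cumulative mean
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = np.where(denom > 0, (mu_total * omega - mu) ** 2 / denom, 0.0)
    return int(np.argmax(sigma_b))
```

Under these assumptions, `bernsen_surface(stretch_contrast(gray))` would give a per-pixel surface and `otsu_threshold(stretch_contrast(gray))` a single global level, where `gray` is a 2-D array of gray values. How the paper merges them with T3 to T5 is not recoverable from the abstract.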
Keywords
Computer science; Digital signal processing; Government; Image segmentation; Laboratories; Mathematics; Optical character recognition software; Pixel; Smoothing methods; Space technology; OCR; binarization; document image
fLanguage
English
Publisher
IEEE
Conference_Titel
Image and Signal Processing, 2008. CISP '08. Congress on
Conference_Location
Sanya, China
Print_ISBN
978-0-7695-3119-9
Type
conf
DOI
10.1109/CISP.2008.262
Filename
4566727