• DocumentCode
    2158577
  • Title

    OCR Oriented Binarization Method of Document Image

  • Author

    Yang, You

  • Volume
    4
  • fYear
    2008
  • fDate
    27-30 May 2008
  • Firstpage
    622
  • Lastpage
    625
  • Abstract
    For gray-level images of government resource documents, a linear transform was employed to enhance image contrast, and a spatial filter was applied to remove image noise. After this preprocessing, the threshold surface T1 was computed by the Bernsen algorithm, and the global threshold T2 was calculated by a modified Otsu method. On the basis of T1 and T2, three further thresholds were defined: the broken-stroke value T3, the neighborhood average value T4, and the combined global-local value T5. The gray-level image was then binarized using a combination of these five values. Experiments showed that the proposed five-threshold method adapts to various government documents. By eliminating ghost artifacts and mending broken strokes, it benefits OCR.
  • Keywords
    Computer science; Digital signal processing; Government; Image segmentation; Laboratories; Mathematics; Optical character recognition software; Pixel; Smoothing methods; Space technology; OCR; binarization; document image
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Image and Signal Processing, 2008. CISP '08. Congress on
  • Conference_Location
    Sanya, China
  • Print_ISBN
    978-0-7695-3119-9
  • Type

    conf

  • DOI
    10.1109/CISP.2008.262
  • Filename
    4566727
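
The preprocessing and the first two thresholds named in the abstract can be illustrated with a minimal sketch. It is not the paper's method: it uses the standard Otsu threshold rather than the modified one, omits the spatial filter and the thresholds T3 to T5 (the abstract does not define them), and combines T1 and T2 with the common Bernsen low-contrast fallback, which is an assumption. All function names and the parameters `radius` and `contrast_min` are hypothetical.

```python
def linear_stretch(img):
    """Linearly map the gray range [min, max] of img onto [0, 255]."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:  # flat image: nothing to stretch
        return [row[:] for row in img]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in img]


def window_min_max(img, y, x, radius):
    """Min and max gray level in the square window, clipped at the borders."""
    h, w = len(img), len(img[0])
    vals = [img[j][i]
            for j in range(max(0, y - radius), min(h, y + radius + 1))
            for i in range(max(0, x - radius), min(w, x + radius + 1))]
    return min(vals), max(vals)


def otsu_threshold(img):
    """Standard Otsu: the level t that maximizes between-class variance."""
    hist = [0] * 256
    for row in img:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * c for i, c in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with level <= t
    sum0 = 0  # sum of gray levels of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t


def binarize(img, radius=1, contrast_min=15):
    """Binarize with the Bernsen surface T1, falling back to the global T2.

    Assumption: where local contrast is below contrast_min, the global
    Otsu value T2 replaces the local midpoint T1. Ink is 0, paper is 255.
    """
    stretched = linear_stretch(img)
    t2 = otsu_threshold(stretched)
    out = []
    for y, row in enumerate(stretched):
        out_row = []
        for x, p in enumerate(row):
            lo, hi = window_min_max(stretched, y, x, radius)
            t1 = (lo + hi) / 2 if hi - lo >= contrast_min else t2
            out_row.append(0 if p <= t1 else 255)
        out.append(out_row)
    return out


page = [[200, 200, 200, 200],
        [200,  50,  50, 200],
        [200, 200, 200, 200]]
result = binarize(page)
```

Here `page` is a toy 3x4 image with a dark two-pixel stroke on a light background. The input is assumed to be a non-empty list of rows of integer gray levels in the range 0 to 255.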