• DocumentCode
    2734102
  • Title

    Robust binarization of degraded documents using adaptive-cum-interpolative thresholding in a multi-scale framework

  • Author

    Bag, Soumen ; Bhowmick, Partha ; Behera, Priyaranjan ; Harit, Gaurav

  • Author_Institution
    Comput. Sc. & Eng. Dept., IIT Kharagpur, Kharagpur, India
  • fYear
    2011
  • fDate
    3-5 Nov. 2011
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    A novel technique for binarization of degraded documents is proposed. It works in a multi-scale framework, using adaptive-cum-interpolative thresholding as a modification of Otsu's method. Instead of computing a global threshold value for an input document image, it computes local threshold values for a small set of grid points by observing the intensity pattern of the pixels lying in the concerned grid cells. The thresholds estimated for these grid points are used, in turn, to compute the threshold values of all the remaining pixels using a fast-yet-efficient interpolation procedure. To handle noise in degraded images, this grid-based adaptive thresholding is applied at successively reduced scales to obtain a near-optimal binarization as a set of connected components. Post-processing of these connected components yields the final output. Exhaustive experimentation has been carried out with benchmark datasets, including the George Washington corpus of handwritten documents, and also with our own datasets. When compared to other methods, the proposed method is found to be robust and appreciably better, as tested by conventional evaluation schemes.
  • Keywords
    document image processing; handwriting recognition; image segmentation; interpolation; visual databases; George Washington corpus; Otsu method; adaptive-cum-interpolative thresholding; degraded document binarization; degraded image noise handling; document image; fast-yet-efficient interpolation; grid cells; grid points; grid-based adaptive thresholding; handwritten documents; multiscale framework; near-optimal binarization; pixels intensity pattern; robust binarization; Adaptation models; Character recognition; Information processing; Interpolation; Measurement; PSNR; Adaptive thresholding; Degraded documents; Document image binarization; Grid-based approach; Multi-scale framework
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Image Information Processing (ICIIP), 2011 International Conference on
  • Conference_Location
    Himachal Pradesh
  • Print_ISBN
    978-1-61284-859-4
  • Type

    conf

  • DOI
    10.1109/ICIIP.2011.6108912
  • Filename
    6108912
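
The grid-based idea summarized in the abstract (local Otsu thresholds at a few grid points, then interpolation to every pixel) can be illustrated with a minimal single-scale sketch. This is not the authors' implementation: the function names, the `cell` parameter, the choice of one threshold per cell centre, and the use of bilinear interpolation are all assumptions made for illustration, and the multi-scale loop and connected-component post-processing are omitted.

```python
import numpy as np

def otsu_threshold(values):
    """Otsu's threshold for an array of uint8 intensities."""
    hist = np.bincount(values.ravel(), minlength=256).astype(np.float64)
    p = hist / hist.sum()
    omega = np.cumsum(p)                 # probability of class "<= t"
    mu = np.cumsum(p * np.arange(256))   # cumulative mean up to t
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        # Between-class variance; defined as 0 where one class is empty.
        sigma_b = np.where(denom > 0, (mu_total * omega - mu) ** 2 / denom, 0.0)
    return float(np.argmax(sigma_b))

def interpolated_threshold_map(image, cell=32):
    """Per-pixel thresholds from per-cell Otsu values (hypothetical helper)."""
    h, w = image.shape
    ys = np.arange(0, h, cell)
    xs = np.arange(0, w, cell)
    grid = np.empty((len(ys), len(xs)))
    for i, y in enumerate(ys):
        for j, x in enumerate(xs):
            grid[i, j] = otsu_threshold(image[y:y + cell, x:x + cell])
    # Assumption: each threshold is attached to the centre of its cell.
    cy = (ys + np.minimum(ys + cell, h) - 1) / 2.0
    cx = (xs + np.minimum(xs + cell, w) - 1) / 2.0
    # Separable bilinear interpolation: along x first, then along y.
    along_x = np.array([np.interp(np.arange(w), cx, row) for row in grid])
    return np.array([np.interp(np.arange(h), cy, col) for col in along_x.T]).T

def binarize(image, cell=32):
    """True marks background (brighter than the local threshold)."""
    return image > interpolated_threshold_map(image, cell)
```

A cell containing only one intensity yields a threshold of 0 in this sketch, so uniformly dark regions would be labelled background; a practical method would need to handle such low-contrast cells explicitly.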