• DocumentCode
    1758638
  • Title

    Influence of Color-to-Gray Conversion on the Performance of Document Image Binarization: Toward a Novel Optimization Problem

  • Author

    Hedjam, Rachid ; Nafchi, Hossein Ziaei ; Kalacska, Margaret ; Cheriet, Mohamed

  • Author_Institution
    Dept. of Geography, Remote Sensing Lab., McGill Univ., Montreal, QC, Canada
  • Volume
    24
  • Issue
    11
  • fYear
    2015
  • fDate
    Nov. 2015
  • Firstpage
    3637
  • Lastpage
    3651
  • Abstract
    This paper presents a novel preprocessing method for color-to-gray document image conversion. In contrast to conventional methods designed for natural images, which aim to preserve the contrast between different classes in the converted gray image, the proposed conversion method reduces the contrast (i.e., intensity variance) within the text class as much as possible. It is based on learning a linear filter from a predefined data set of text and background pixels that: 1) when applied to background pixels, minimizes the output response and 2) when applied to text pixels, maximizes the output response while minimizing the intensity variance within the text class. The proposed method (called learning-based color-to-gray) is intended as a preprocessing step for document image binarization. A data set of 46 historical document images is created and used to evaluate the proposed method subjectively and objectively. The method proves effective and has a marked impact on the performance of state-of-the-art binarization methods. Four other Web-based image data sets are created to evaluate the scalability of the proposed method.
  • Keywords
    document image processing; image colour analysis; linear phase filters; optimisation; Web-based image data sets; background pixels; color-to-gray document image conversion; document image binarization performance; intensity variance minimization; linear filter; optimization problem; text pixels; Color; Degradation; Gray-scale; Image color analysis; Image edge detection; Optimization; Training; Color-to-gray image conversion; Document image binarization; Historical document analysis; Linear filter optimization
  • fLanguage
    English
  • Journal_Title
    Image Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1057-7149
  • Type

    jour

  • DOI
    10.1109/TIP.2015.2442923
  • Filename
    7120120
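
The filter learning summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual objective, so everything below is an assumption made for illustration: the filter is taken to be a single 3-vector of RGB weights, the three goals are combined into one Rayleigh-type quotient (squared mean text response divided by the sum of background response energy and text response variance), and the names learn_gray_filter and to_gray are hypothetical. The ink and paper colors are synthetic values, not data from the paper.

```python
import numpy as np


def learn_gray_filter(text_px, bg_px, eps=1e-6):
    """Learn RGB weights w from (N, 3) arrays of text and background pixels.

    Assumed objective (not taken from the paper):
        maximize (w . mean_text)^2 / (w^T (M_bg + Cov_text) w)
    where M_bg is the background second-moment matrix (small background
    response) and Cov_text is the text covariance (small text variance).
    The numerator has rank one, so the maximizer is proportional to
    (M_bg + Cov_text)^-1 mean_text.
    """
    text_px = np.asarray(text_px, dtype=float)
    bg_px = np.asarray(bg_px, dtype=float)
    mean_text = text_px.mean(axis=0)
    cov_text = np.cov(text_px, rowvar=False)
    m_bg = bg_px.T @ bg_px / len(bg_px)
    # eps keeps the 3x3 system well conditioned.
    a = m_bg + cov_text + eps * np.eye(3)
    w = np.linalg.solve(a, mean_text)
    return w / np.linalg.norm(w)


def to_gray(image, w):
    """Apply the weights to an (..., 3) color array; returns a (...) array."""
    return np.asarray(image, dtype=float) @ w


# Synthetic training pixels: brownish ink on yellowish paper (illustrative only).
rng = np.random.default_rng(0)
ink = np.array([0.3, 0.2, 0.1])
paper = np.array([0.9, 0.8, 0.6])
text_px = ink + 0.02 * rng.standard_normal((2000, 3))
bg_px = paper + 0.02 * rng.standard_normal((2000, 3))

w = learn_gray_filter(text_px, bg_px)
gray_text = to_gray(text_px, w)
gray_bg = to_gray(bg_px, w)
print("weights:", w)
print("mean text response:", gray_text.mean())
print("mean |background response|:", np.abs(gray_bg).mean())
```

Under these assumptions the learned weights respond strongly to ink-colored pixels and stay close to zero on paper-colored pixels, so a global threshold on the converted image separates the two classes more easily than a fixed luminance conversion would. The weights may be negative, so the output would need rescaling before being stored as an ordinary gray image.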