• DocumentCode
    2472544
  • Title

    Background noise detection and cleaning in document images

  • Author

    Ali, Majdi Ben Hadj

  • Author_Institution
    German Res. Center for Artificial Intelligence GmbH, Germany
  • Volume
    3
  • fYear
    1996
  • fDate
    25-29 Aug 1996
  • Firstpage
    758
  • Abstract
    A digitized binary image in which text overlaps background noise or a complex background image is poor input for OCR systems, most of which recognize only black characters on a uniform white background, or vice versa. Text that overlaps background noise regions can be opened directly with an appropriate structuring element to remove the background components that touch the characters. Applied globally to a document image, however, such methods degrade the “clean” text (i.e., text on a uniform white background), and character recognition accuracy decreases rapidly. A simple and effective approach is to distinguish the “noisy” text regions, where the overhead of cleaning and enhancement is needed, from the “clean” text regions, where an OCR device already yields good recognition results. The author focuses on detecting noise regions and presents an objective evaluation method. As an example, the method is used to evaluate a standard noise cleaning method for document images.
  • Keywords
    document image processing; image segmentation; optical character recognition; background noise detection; character recognition; digitized binary image; document images; noise cleaning method; noise regions; objective evaluation method; overlapping text; Artificial intelligence; Background noise; Character recognition; Cleaning; Colored noise; Image recognition; Optical character recognition software; Postal services; Testing; Text recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Proceedings of the 13th International Conference on Pattern Recognition, 1996
  • Conference_Location
    Vienna
  • ISSN
    1051-4651
  • Print_ISBN
    0-8186-7282-X
  • Type

    conf

  • DOI
    10.1109/ICPR.1996.547270
  • Filename
    547270
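
The abstract's point about opening with a structuring element, and about restricting it to "noisy" regions, can be illustrated with a minimal sketch. This is not the author's implementation: the square structuring element, the function names (erode, dilate, open_binary, open_region), the hand-supplied region box, and the toy page are all assumptions made for illustration. Pixels are 1 for ink and 0 for background.

```python
def _window(y, x, r):
    """Coordinates covered by a (2r+1) x (2r+1) square centred on (y, x)."""
    return [(y + dy, x + dx) for dy in range(-r, r + 1) for dx in range(-r, r + 1)]


def erode(img, k):
    """Keep a pixel only if the whole k x k square around it is ink.
    Pixels outside the image count as background (an assumption)."""
    h, w, r = len(img), len(img[0]), k // 2

    def ink(yy, xx):
        return 0 <= yy < h and 0 <= xx < w and img[yy][xx] == 1

    return [[int(all(ink(yy, xx) for yy, xx in _window(y, x, r)))
             for x in range(w)] for y in range(h)]


def dilate(img, k):
    """Set a pixel if any pixel in the k x k square around it is ink."""
    h, w, r = len(img), len(img[0]), k // 2

    def ink(yy, xx):
        return 0 <= yy < h and 0 <= xx < w and img[yy][xx] == 1

    return [[int(any(ink(yy, xx) for yy, xx in _window(y, x, r)))
             for x in range(w)] for y in range(h)]


def open_binary(img, k):
    """Morphological opening: erosion followed by dilation."""
    return dilate(erode(img, k), k)


def open_region(img, box, k):
    """Open only inside box = (top, left, bottom, right), end-exclusive.
    The box is assumed to come from a separate noise-region detector."""
    top, left, bottom, right = box
    crop = [row[left:right] for row in img[top:bottom]]
    cleaned = open_binary(crop, k)
    out = [row[:] for row in img]
    for y, row in enumerate(cleaned):
        out[top + y][left:right] = row
    return out


# Toy page: a thin "clean" stroke at column 1, and a thick 3x3 stroke
# (columns 6-8) touched by a one-pixel-high noise line (columns 9-11).
page = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]

globally_opened = open_binary(page, 3)                 # whole page
locally_opened = open_region(page, (0, 5, 5, 12), 3)   # noisy box only
```

In this toy case, opening the whole page removes the noise line but also erases the thin clean stroke, whereas opening only the noisy box removes the noise line and leaves the clean stroke intact. How the noisy box would be found in practice is the detection problem the paper addresses and is not modelled here.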