• DocumentCode
    2142561
  • Title
    Automatic Estimation of the Legibility of Binarised Historic Documents for Unsupervised Parameter Tuning
  • Author
    Stommel, M. ; Frieder, G.

  • Author_Institution
    Artificial Intell. Group, Univ. of Bremen, Bremen, Germany
  • fYear
    2011
  • fDate
    18-21 Sept. 2011
  • Firstpage
    104
  • Lastpage
    108
  • Abstract
    Document enhancement tools are a valuable help in the study of historic documents. Given proper filter settings, many effects that impair the legibility can be evened out (e.g. washed out ink, stained and yellowed paper). However, because of differing authors, languages, handwritings, fonts and paper conditions, no single filter parameter set fits all documents. Therefore, the parameters are usually tuned in a time-consuming manual process to every individual document. To simplify this procedure, this paper introduces a classifier for the legibility of an enhanced historic text document. Experiments on the binarisation of a set of documents from 1938 to 1946 show that the classifier can be used to automatically derive robust filter settings for a variety of documents.
  • Keywords
    document image processing; filtering theory; history; image classification; image enhancement; text analysis; document binarisation; document classifier; document enhancement tool; historic documents; historic text document; legibility automatic estimation; robust filter setting; unsupervised parameter tuning; Character recognition; Estimation; Gravity; Noise; Optical character recognition software; Robustness; document enhancement; legibility estimation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition (ICDAR), 2011 International Conference on
  • Conference_Location
    Beijing
  • ISSN
    1520-5363
  • Print_ISBN
    978-1-4577-1350-7
  • Electronic_ISBN
    1520-5363
  • Type
    conf
  • DOI
    10.1109/ICDAR.2011.30
  • Filename
    6065285