• DocumentCode
    1637430
  • Title
    Generic Feature Selection and Document Processing
  • Author
    Chouaib, H. ; Vincent, N. ; Cloppet, F. ; Tabbone, S.
  • Author_Institution
    Lab. CRIP5(EA 2517), Univ. Paris Descartes, Paris, France
  • fYear
    2009
  • Firstpage
    356
  • Lastpage
    360
  • Abstract
    This paper presents a generic feature selection method and its application to several document analysis problems. The method is based on a genetic algorithm (GA) whose fitness function is defined by combining AdaBoost classifiers, one associated with each feature. The method is not tied to the classifier that performs the final recognition task; instead, a combination of weak classifiers is used to evaluate each subset of features. The selected features can therefore be used afterwards with whichever classifier is most appropriate. The method has been tested on three applications: drop cap classification, handwritten digit recognition, and text detection. The results show the efficiency and robustness of the proposed approach.
  • Keywords
    document image processing; feature extraction; genetic algorithms; handwritten character recognition; image classification; learning (artificial intelligence); text analysis; AdaBoost classifier; GA; document analysis problem; dropcaps classification; fitness function; generic feature subset selection method; genetic algorithm; handwritten digit recognition; text detection; Genetic algorithms; Handwriting recognition; Noise shaping; Pattern analysis; Pattern recognition; Robustness; Testing; Text analysis; Text recognition; Training data; AdaBoost; Drop caps; Feature selection; Genetic algorithm
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 2009. ICDAR '09. 10th International Conference on
  • Conference_Location
    Barcelona
  • ISSN
    1520-5363
  • Print_ISBN
    978-1-4244-4500-4
  • Electronic_ISBN
    1520-5363
  • Type
    conf
  • DOI
    10.1109/ICDAR.2009.200
  • Filename
    5277672
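
The abstract describes a genetic algorithm whose fitness function combines weak classifiers attached to individual features. The sketch below illustrates that general idea only and is not the authors' implementation. Several parts are assumptions made for illustration: each per-feature AdaBoost classifier is replaced by a single threshold stump, the combination rule is a plain majority vote, labels are binary (0/1), and all function names and GA settings (`select_features`, `pop_size`, `p_mut`, and so on) are hypothetical.

```python
import random


def train_stump(values, labels):
    """Fit a one-feature threshold classifier.

    Assumption: a stand-in for the per-feature AdaBoost classifier
    mentioned in the abstract. Returns (threshold, polarity).
    """
    best = (0.0, 1, -1.0)  # (threshold, polarity, accuracy)
    for t in sorted(set(values)):
        for polarity in (1, -1):
            preds = [1 if polarity * (v - t) >= 0 else 0 for v in values]
            acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
            if acc > best[2]:
                best = (t, polarity, acc)
    return best[0], best[1]


def stump_predict(stump, value):
    """Predict 0 or 1 for a single feature value."""
    t, polarity = stump
    return 1 if polarity * (value - t) >= 0 else 0


def fitness(mask, stumps, X, y):
    """Accuracy of the majority vote of the stumps for the selected features.

    Assumption: majority voting is one possible combination rule;
    ties are resolved in favour of class 1.
    """
    selected = [j for j, bit in enumerate(mask) if bit]
    if not selected:
        return 0.0  # an empty subset cannot classify anything
    correct = 0
    for row, label in zip(X, y):
        votes = sum(stump_predict(stumps[j], row[j]) for j in selected)
        pred = 1 if 2 * votes >= len(selected) else 0
        correct += pred == label
    return correct / len(y)


def select_features(X, y, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Search for a feature mask with a simple GA (needs at least 2 features)."""
    rng = random.Random(seed)
    n = len(X[0])
    # One weak classifier per feature, trained once and reused by the fitness.
    stumps = [train_stump([row[j] for row in X], y) for j in range(n)]
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]

    def score(mask):
        return fitness(mask, stumps, X, y)

    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection keeps the best half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit independently with probability p_mut.
            child = [bit ^ (rng.random() < p_mut) for bit in child]
            children.append(child)
        pop = parents + children
    best = max(pop, key=score)
    return best, score(best)
```

Calling `select_features(X, y)` with `X` as a list of numeric feature rows and `y` as a list of 0/1 labels returns a bit mask over the features and its fitness. Because the fitness depends only on the per-feature weak classifiers, the selected subset can afterwards be passed to any final classifier, which is the decoupling the abstract emphasises. For brevity the sketch scores subsets on the training data itself; a held-out validation set would be the more careful choice.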