• DocumentCode
    356798
  • Title

    Data crawlers for simple optical character recognition

  • Author

    Ashlock, Dan

  • Author_Institution
    Dept. of Math. & Complex Adaptive Syst., Iowa State Univ., Ames, IA, USA
  • Volume
    1
  • fYear
    2000
  • fDate
    2000
  • Firstpage
    706
  • Abstract
    Many genetic programming systems have been designed to exploit the use of state information in an indirect fashion. In this article we apply a genetic programming technique that directly incorporates state information to a collection of related optical character recognition tasks. Our recognizers are coded as GP-Automata, finite state machines modified by associating a function, stored as a parse tree, with each state. These functions are called deciders and serve to extract information from a high-bandwidth input to drive finite state transitions. The GP-Automata make iterated decisions, requesting additional data in an adaptive fashion. This iterated data processing is a form of “crawling through the data” and so we term the software objects data crawlers. These objects can be thought of as expert systems, produced automatically from data by digital evolution. The states form rules, with the deciders supplying the “if” part of these rules. We evolve perfect recognizers for three variations of a character set derived from the set of 4-ominoes.
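    The structure described in the abstract (states, a decider attached to each state, and transitions that request more data) can be illustrated with a minimal hand-written sketch. Everything here is an assumption made for illustration: the names `State`, `Transition`, and `run_crawler`, the action strings, the positive/non-positive branching rule, and the bar-orientation task are hypothetical and are not taken from the paper. In particular, the deciders are ordinary Python functions standing in for evolved parse trees, and nothing is evolved.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Grid = List[List[int]]

# Hypothetical action vocabulary: move the read position or announce a label.
MOVES = {"stay": (0, 0), "N": (-1, 0), "S": (1, 0), "E": (0, 1), "W": (0, -1)}


@dataclass
class Transition:
    action: str       # a key of MOVES, or "answer:<label>" to stop and classify
    next_state: int   # index of the state to enter after a move


@dataclass
class State:
    # The decider supplies the "if" part of the rule; here it is a plain
    # function standing in for a parse tree (an assumption of this sketch).
    decider: Callable[[Grid, int, int], int]
    if_positive: Transition
    otherwise: Transition


def pixel(grid: Grid, r: int, c: int) -> int:
    """Return the pixel at (r, c), treating out-of-bounds positions as blank."""
    if 0 <= r < len(grid) and 0 <= c < len(grid[0]):
        return grid[r][c]
    return 0


def run_crawler(states: List[State], grid: Grid,
                start: Tuple[int, int] = (0, 0),
                max_steps: int = 50) -> Optional[str]:
    """Iterate decide -> act -> change state until an answer or the step limit."""
    r, c = start
    s = 0
    for _ in range(max_steps):
        state = states[s]
        t = state.if_positive if state.decider(grid, r, c) > 0 else state.otherwise
        if t.action.startswith("answer:"):
            return t.action.split(":", 1)[1]
        dr, dc = MOVES[t.action]
        r, c = r + dr, c + dc
        s = t.next_state
    return None  # no decision reached within the step limit


# A hand-built three-state crawler that tells a horizontal bar from a vertical one.
STATES = [
    # State 0: crawl east along the top row until ink is found.
    State(lambda g, r, c: pixel(g, r, c),
          Transition("stay", 1), Transition("E", 0)),
    # State 1: ink to the east means the bar is horizontal.
    State(lambda g, r, c: pixel(g, r, c + 1),
          Transition("answer:horizontal", 0), Transition("stay", 2)),
    # State 2: ink to the south means the bar is vertical.
    State(lambda g, r, c: pixel(g, r + 1, c),
          Transition("answer:vertical", 0), Transition("answer:unknown", 0)),
]

GRID_H = [[0, 1, 1, 1, 1]] + [[0] * 5 for _ in range(3)]
GRID_V = [[0, 0, 1, 0, 0] for _ in range(4)]
GRID_EMPTY = [[0] * 5 for _ in range(4)]

if __name__ == "__main__":
    print(run_crawler(STATES, GRID_H))
    print(run_crawler(STATES, GRID_V))
    print(run_crawler(STATES, GRID_EMPTY))
```

    Each state reads as one rule: the decider inspects a small part of the image near the current position, and its result selects both the next action and the next state, so the crawler only looks at the data it asks for.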
  • Keywords
    expert systems; finite state machines; optical character recognition; trees (mathematics); 4-ominoes; GP-Automata; data crawlers; finite state transitions; genetic programming systems; parse tree; state information; Automata; Bandwidth; Character recognition; Crawlers; Data mining; Evolutionary computation; Feature extraction; Genetics; Mathematics; Optical character recognition software
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Evolutionary Computation, 2000. Proceedings of the 2000 Congress on
  • Conference_Location
    La Jolla, CA
  • Print_ISBN
    0-7803-6375-2
  • Type

    conf

  • DOI
    10.1109/CEC.2000.870367
  • Filename
    870367