• DocumentCode
    304705
  • Title
    Document image decoding approach to character template estimation
  • Author
    Kopec, Gary E.; Lomelin, Mauricio
  • Author_Institution
    Xerox Palo Alto Res. Center, CA, USA
  • Volume
    1
  • fYear
    1996
  • fDate
    16-19 Sep 1996
  • Firstpage
    213
  • Abstract
    An approach to supervised training of document-specific character templates from sample page images and unaligned transcriptions is presented. The template estimation problem is formulated as one of constrained maximum likelihood parameter estimation within the document image decoding (DID) framework. This leads to a two-phase iterative training algorithm consisting of transcription alignment and aligned template estimation (ATE) steps. The ATE step is the heart of the algorithm and involves assigning template pixel colors to maximize likelihood while satisfying a template disjointness constraint. In one large-scale experiment, use of document-specific templates resulted in a character error rate that was about an order of magnitude lower than that of a commercial omni-font OCR program.
  • Keywords
    document image processing; error statistics; iterative methods; learning (artificial intelligence); maximum likelihood decoding; maximum likelihood estimation; optical character recognition; DID; aligned template estimation; character error rate; character template estimation; constrained maximum likelihood parameter estimation; document image decoding; document-specific character templates; document-specific templates; sample page images; supervised training; template disjointness; template pixel colors; transcription alignment; two-phase iterative training algorithm; unaligned transcriptions; Error analysis; Heart; Image recognition; Image segmentation; Iterative algorithms; Iterative decoding; Labeling; Maximum likelihood decoding; Maximum likelihood estimation; Parameter estimation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Image Processing, 1996. Proceedings., International Conference on
  • Conference_Location
    Lausanne
  • Print_ISBN
    0-7803-3259-8
  • Type
    conf
  • DOI
    10.1109/ICIP.1996.560730
  • Filename
    560730
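
The aligned template estimation (ATE) step summarized in the abstract can be illustrated with a small greedy sketch. This is not the authors' implementation: it assumes a simple bit-flip noise model with hypothetical parameters `alpha0` and `alpha1` (probabilities that a background or foreground pixel is observed correctly), a fixed rectangular canvas per template, and glyph origins already supplied by the alignment step. The function name `estimate_templates` and the scoring weights `gamma` and `beta` are illustrative assumptions. Each template pixel is scored by how much turning it black would raise the log-likelihood; the best positive-scoring pixel is assigned, and the image pixels it explains are cleared so that no other template can claim them, which is one way to respect a disjointness constraint.

```python
import math
import numpy as np

def estimate_templates(image, origins, canvas_shape, alpha0=0.9, alpha1=0.9):
    """Greedy, disjointness-preserving template estimation (illustrative only).

    image        : 2D array of 0/1 pixels (1 = black ink)
    origins      : dict mapping a character label to a list of (row, col)
                   top-left canvas positions; canvases must lie inside the image
    canvas_shape : (height, width) shared by all templates (an assumption)
    """
    work = image.astype(bool).copy()   # ink not yet claimed by any template
    h, w = canvas_shape
    # Assumed log-likelihood weights under a bit-flip noise model.
    gamma = math.log(alpha0 * alpha1 / ((1 - alpha0) * (1 - alpha1)))
    beta = math.log((1 - alpha1) / alpha0)
    templates = {t: np.zeros((h, w), dtype=bool) for t in origins}

    while True:
        best_score, best_t, best_px = 0.0, None, None
        for t, positions in origins.items():
            # For each canvas pixel, count instances with unclaimed ink there.
            counts = np.zeros((h, w))
            for (r, c) in positions:
                counts += work[r:r + h, c:c + w]
            score = gamma * counts + beta * len(positions)
            score[templates[t]] = -np.inf      # skip already-black pixels
            idx = np.unravel_index(np.argmax(score), score.shape)
            if score[idx] > best_score:
                best_score, best_t, best_px = score[idx], t, idx
        if best_t is None:                     # no pixel improves likelihood
            break
        templates[best_t][best_px] = True
        # Clear the explained ink so other templates cannot reuse it.
        for (r, c) in origins[best_t]:
            work[r + best_px[0], c + best_px[1]] = False
    return templates
```

In the two-phase scheme the abstract describes, a routine like this would alternate with a transcription alignment step that re-estimates the glyph origins from the current templates; that alignment step is not sketched here.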