• DocumentCode
    2199171
  • Title

    Using Field Interdependence to Improve Correction Performance in a Transducer-Based OCR Post-Processing System

  • Author

    Perez-Cortes, Juan-Carlos ; Llobet, Rafael ; Navarro-Cerdan, J. Ramon ; Arlandis, Joaquim

  • Author_Institution
    Inst. Tecnol. de Inf., Univ. Politec. de Valencia, Valencia, Spain
  • fYear
    2010
  • fDate
    16-18 Nov. 2010
  • Firstpage
    605
  • Lastpage
    610
  • Abstract
    In an automatic handwritten form processing system, it is often necessary to exploit the lexical or linguistic restrictions on the field contents in order to obtain acceptable recognition rates. Since each field is known to hold a given kind of information (name, address, etc.), a language model can be defined for it. A typical form, however, often contains fields linked by known relations, such as “Street” and “Postal Code” or “Country” and “City”. We have used Weighted Finite-State Transducers (WFSTs) to combine Stochastic Error-Correcting Language Models from different interdependent fields in real handwritten forms and measured the improvements obtained.
  • Keywords
    computational linguistics; error correction; optical character recognition; stochastic processes; OCR; correction performance; field interdependence; language model; linguistic restriction; stochastic error correction; weighted finite state transducer; Expected Error Rate; Postprocessing; Reject Rate; Threshold
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Frontiers in Handwriting Recognition (ICFHR), 2010 International Conference on
  • Conference_Location
    Kolkata
  • Print_ISBN
    978-1-4244-8353-2
  • Type

    conf

  • DOI
    10.1109/ICFHR.2010.99
  • Filename
    5693630