DocumentCode
2199171
Title
Using Field Interdependence to Improve Correction Performance in a Transducer-Based OCR Post-Processing System
Author
Perez-Cortes, Juan-Carlos ; Llobet, Rafael ; Navarro-Cerdan, J. Ramon ; Arlandis, Joaquim
Author_Institution
Inst. Tecnol. de Inf., Univ. Politec. de Valencia, Valencia, Spain
fYear
2010
fDate
16-18 Nov. 2010
Firstpage
605
Lastpage
610
Abstract
In an automatic handwritten form processing system, it is often necessary to use the lexical or linguistic restrictions present in the field contents in order to obtain acceptable recognition rates. Since each field is known to hold a given kind of information (name, address, etc.), a language model can be defined for it. However, a typical form often contains fields linked by known relations, such as “Street” and “Postal Code” or “Country” and “City”. We have used Weighted Finite-State Transducers (WFSTs) to combine Stochastic Error-Correcting Language Models from different interdependent fields in real handwritten forms and measured the improvements obtained.
Keywords
computational linguistics; error correction; optical character recognition; stochastic processes; OCR; correction performance; field interdependence; language model; linguistic restriction; stochastic error correction; weighted finite state transducer; Expected Error Rate; Postprocessing; Reject Rate; Threshold
fLanguage
English
Publisher
IEEE
Conference_Titel
Frontiers in Handwriting Recognition (ICFHR), 2010 International Conference on
Conference_Location
Kolkata
Print_ISBN
978-1-4244-8353-2
Type
conf
DOI
10.1109/ICFHR.2010.99
Filename
5693630