DocumentCode
183398
Title
Rejecting Both Segmentation and Classification Errors in Handwritten Form Processing
Author
De Stefano, Claudio ; Fontanella, Francesco ; Marcelli, Angelo ; Parziale, Antonio ; Di Freca, Alessandra Scotto
Author_Institution
Dipt. di Ing. Elettr. e dell'Inf. (DIEI), Univ. di Cassino e del Lazio Meridionale, Cassino, Italy
fYear
2014
fDate
1-4 Sept. 2014
Firstpage
569
Lastpage
574
Abstract
Commercially available form processing systems include a verification step in which a human operator checks the system's output to ensure 100% accuracy. To reduce the time and cost of this stage, the OCR engine incorporated into the system provides a reliability measure of the classification, which is used to implement a reject option: in this way, only rejected samples are passed to the verification stage. Most strategies for designing such a reject option assume that the source of classification errors is within the OCR engine. This assumption becomes less reasonable as forms become less structured, as in the case when boxes are provided for the entire data field and not only for isolated characters. Under these circumstances, we investigate to what extent the reliability measure provided by an OCR engine designed to deal with boxed isolated characters can be used to detect both segmentation and classification errors. The experimental results, obtained on a large data set of forms currently in use by a large organization, show that the proposed method successfully achieves its aim. It represents a powerful tool for the system manager to plan system enhancement as the volume of forms containing less constrained data fields increases.
Keywords
handwritten character recognition; optical character recognition; OCR engine; classification errors; handwritten form processing systems; human operator; reliability measure; segmentation; system enhancement; verification step; Accuracy; Engines; Optical character recognition software; Optimization; Radio frequency; Reliability; Training; Classification Reliability; OCR; Random Forest; Reject Option
fLanguage
English
Publisher
IEEE
Conference_Titel
Frontiers in Handwriting Recognition (ICFHR), 2014 14th International Conference on
Conference_Location
Heraklion
ISSN
2167-6445
Print_ISBN
978-1-4799-4335-7
Type
conf
DOI
10.1109/ICFHR.2014.101
Filename
6981080