Regional Information Center for Science and Technology - Efficient detection of abnormalities in large OCR databases

DocumentCode :

2195132

Title :

Efficient detection of abnormalities in large OCR databases

Author :

HA, Thien M.

Author_Institution :

Inst. of Comput. Sci. & Appl. Math., Berne Univ., Switzerland

Volume :

fYear :

1997

fDate :

18-20 Aug 1997

Firstpage :

1006

Abstract :

Building large optical character recognition (OCR) databases is time-consuming and tedious. Moreover, the process is error-prone because segmentation is difficult and labelling is uncertain. When the database is very large, say one million patterns, human errors due to fatigue and inattention become a critical factor. This paper discusses one method to alleviate the burden caused by these problems. Specifically, the method allows automatic detection of abnormalities, e.g., mislabelling, and thus may help clean up a labelled database. The method is based on the optimum class-selective rejection rule. As a test case, the method is applied to the NIST databases containing nearly 300,000 handwritten numerals.
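
The abstract does not spell out the rule, so the following is only an illustrative sketch of how a class-selective rejection rule could be used to flag suspicious labels. It assumes the rule takes the form "keep every class whose posterior probability exceeds a threshold, and fall back to the single most probable class if none does"; that form, the function names `select_classes` and `flag_suspects`, the threshold value, and the toy posteriors are all assumptions made for illustration and are not taken from this record. The posteriors are assumed to come from some already-trained classifier.

```python
from typing import List, Sequence


def select_classes(posteriors: Sequence[float], threshold: float) -> List[int]:
    """Return the indices of classes whose posterior exceeds `threshold`.

    Assumed rule form (not from the record): if no class exceeds the
    threshold, fall back to the single most probable class so that the
    selected set is never empty.
    """
    selected = [i for i, p in enumerate(posteriors) if p > threshold]
    if not selected:
        best = max(range(len(posteriors)), key=lambda i: posteriors[i])
        selected = [best]
    return selected


def flag_suspects(
    posteriors_per_pattern: Sequence[Sequence[float]],
    labels: Sequence[int],
    threshold: float,
) -> List[int]:
    """Return indices of patterns whose stored label is NOT among the
    classes selected for that pattern (candidate mislabellings)."""
    suspects = []
    for idx, (posteriors, label) in enumerate(zip(posteriors_per_pattern, labels)):
        if label not in select_classes(posteriors, threshold):
            suspects.append(idx)
    return suspects


# Toy data (hypothetical): three patterns, three classes.
toy_posteriors = [
    [0.90, 0.05, 0.05],  # confidently class 0
    [0.45, 0.45, 0.10],  # ambiguous between classes 0 and 1
    [0.02, 0.03, 0.95],  # confidently class 2
]
toy_labels = [0, 1, 0]  # the last label disagrees with the classifier

suspect_indices = flag_suspects(toy_posteriors, toy_labels, threshold=0.2)
print(suspect_indices)
```

In this sketch an ambiguous pattern is not flagged as long as its stored label is one of the selected classes, so only labels that the classifier considers implausible are sent back for human review.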

Keywords :

data integrity; document image processing; errors; handwriting recognition; image segmentation; optical character recognition; very large databases; visual databases; NIST databases; database abnormality detection; error-prone; fatigue; handwriting style; handwritten numerals; human errors; inattention; labelled database; labelling; large OCR databases; optical character recognition databases; optimum class-selective rejection rule; segmentation; time-consuming; uncertainty; Character recognition; Databases; Error analysis; Error correction; Labeling; NIST; Optical character recognition software; Pattern recognition; Probability; Testing;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Proceedings of the Fourth International Conference on Document Analysis and Recognition, 1997

Conference_Location :

Ulm

Print_ISBN :

0-8186-7898-4

Type :

conf

DOI :

10.1109/ICDAR.1997.620661

Filename :

620661

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2195132