Title :
Anatomy of a form reader
Author :
Lam, Stephen W. ; Javanbakht, Ladan ; Srihari, Sargur N.
Author_Institution :
Center of Excellence for Document Analysis & Recognition, State Univ. of New York, Buffalo, NY, USA
Abstract :
Forms are used extensively in today's offices. The task of an automated form reader is to locate the data filled in on a form and to encode the content into appropriate symbolic descriptions. The challenges in form reading are due to the high volume and large variety of forms. A robust form reader with high adaptability and trainability is described. The form reader consists of two modules: a field registration module and a data recognition module. The field registration module acquires knowledge about the forms of interest, and the data recognition module recognizes text data on filled forms using the acquired knowledge. The capability of the reader increases progressively through supervised learning. The form reader has been trained to read a large variety of forms with machine-printed data. The adaptability and trainability of the system have been demonstrated through experiments.
Keywords :
business forms; knowledge acquisition; learning (artificial intelligence); word processing; adaptability; automated form reader; data recognition module; field registration; filled forms; form reading; high adaptability; machine-printed data; offices; robust form reader; supervised learning; symbolic descriptions; text data; trainability; Anatomy; Data mining; Data processing; Databases; Detectors; Java; Robustness; Supervised learning; Text analysis; Text recognition
Conference_Title :
Document Analysis and Recognition, 1993., Proceedings of the Second International Conference on
Conference_Location :
Tsukuba Science City
Print_ISBN :
0-8186-4960-7
DOI :
10.1109/ICDAR.1993.395685