Title :
A family of European page readers
Author :
Baird, Henry S. ; Ilbert, Derrickg ; Ittner, David J.
Author_Institution :
AT&T Bell Labs., Murray Hill, NJ, USA
Abstract :
We have demonstrated a high degree of automation in the engineering of complex machine vision systems by building ten printed-text page readers, each specialized to a European language, at the pace of one language per week. The page readers provide these functions: page layout analysis, polyfont symbol recognition, typographical morphology, lexicon-driven contextual analysis, and Unicode output encoding. The accuracy and speed of the resulting readers are usably high and can be easily improved, if required, by comparatively routine enhancements of subsystems. This exercise illustrates the advantages of a research strategy that emphasizes versatility before, but not at the expense of, accuracy and speed.
Keywords :
document image processing; European language; European page readers; Unicode output encoding; complex machine vision systems; lexicon-driven contextual analysis; page layout analysis; polyfont symbol recognition; printed-text page readers; typographical morphology; Automation; Computer architecture; Design engineering; Encoding; Machine vision; Morphology; Natural languages; Prototypes; Runtime; System software;
Conference_Title :
Pattern Recognition, 1994. Vol. 2 - Conference B: Computer Vision & Image Processing, Proceedings of the 12th IAPR International Conference on
Conference_Location :
Jerusalem
Print_ISBN :
0-8186-6270-0
DOI :
10.1109/ICPR.1994.577014