DocumentCode :
3028622
Title :
Language identification for printed text independent of segmentation
Author :
Wood, Sally L. ; Yao, Xiaozhong ; Krishnamurthi, Kanthimathi ; Dang, Laurence
Author_Institution :
Santa Clara Univ., CA, USA
Volume :
3
fYear :
1995
fDate :
23-26 Oct 1995
Firstpage :
428
Abstract :
This paper presents efficient algorithms for determining the language classification of machine-generated documents without requiring the identification of individual characters. Such algorithms may be useful for sorting and routing facsimile documents as they arrive, so that appropriate routing and secondary analysis, which may include OCR, are selected for each document. They may also prove useful as a component of a content-addressable document access system. There have been numerous reported efforts that attempt to segment printed documents into homogeneous regions using Hough transforms, hidden Markov models, morphological filtering, and neural networks. However, language identification can be accomplished without explicit segmentation using the less computationally intensive methods described
Keywords :
Hough transforms; content-addressable storage; document image processing; facsimile; filtering theory; image segmentation; mathematical morphology; natural languages; optical character recognition; Hough transforms; content addressable document access system; facsimile documents; hidden Markov models; homogeneous regions; language classification; language identification; machine generated documents; morphological filtering; neural networks; printed documents; printed text; routing; secondary analysis; sorting; Algorithm design and analysis; Character generation; Facsimile; Filtering; Hidden Markov models; Independent component analysis; Neural networks; Optical character recognition software; Routing; Sorting;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Image Processing, 1995. Proceedings., International Conference on
Conference_Location :
Washington, DC
Print_ISBN :
0-8186-7310-9
Type :
conf
DOI :
10.1109/ICIP.1995.537663
Filename :
537663