DocumentCode
3028622
Title
Language identification for printed text independent of segmentation
Author
Wood, Sally L. ; Yao, Xiaozhong ; Krishnamurthi, Kanthimathi ; Dang, Laurence
Author_Institution
Santa Clara Univ., CA, USA
Volume
3
fYear
1995
fDate
23-26 Oct 1995
Firstpage
428
Abstract
This paper presents efficient algorithms for determining the language classification of machine generated documents without requiring the identification of individual characters. Such algorithms may be useful for sorting and routing facsimile documents as they arrive, so that appropriate routing and secondary analysis, which may include OCR, are selected for each document. They may also prove useful as a component of a content addressable document access system. There have been numerous reported efforts that attempt to segment printed documents into homogeneous regions using Hough transforms, hidden Markov models, morphological filtering, and neural networks. However, language identification can be accomplished without explicit segmentation using the less computationally intensive methods described
Keywords
Hough transforms; content-addressable storage; document image processing; facsimile; filtering theory; image segmentation; mathematical morphology; natural languages; optical character recognition; Hough transforms; content addressable document access system; facsimile documents; hidden Markov models; homogeneous regions; language classification; language identification; machine generated documents; morphological filtering; neural networks; printed documents; printed text; routing; secondary analysis; sorting; Algorithm design and analysis; Character generation; Facsimile; Filtering; Hidden Markov models; Independent component analysis; Neural networks; Optical character recognition software; Routing; Sorting;
fLanguage
English
Publisher
IEEE
Conference_Titel
Image Processing, 1995. Proceedings., International Conference on
Conference_Location
Washington, DC
Print_ISBN
0-8186-7310-9
Type
conf
DOI
10.1109/ICIP.1995.537663
Filename
537663