DocumentCode
1492883
Title
Automatic script identification from document images using cluster-based templates
Author
Hochberg, Judith ; Kelly, Patrick ; Thomas, Timothy ; Kerns, Lila
Author_Institution
Los Alamos Nat. Lab., NM, USA
Volume
19
Issue
2
fYear
1997
fDate
2/1/1997
Firstpage
176
Lastpage
181
Abstract
We describe an automated script identification system for typeset document images. Templates for each script are created by clustering textual symbols from a training set. Symbols from new images are compared to the templates to find the best script. Our current system processes thirteen scripts with minimal preprocessing and high accuracy.
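The abstract's two steps (cluster training symbols into per-script templates, then match symbols from a new image against those templates) can be illustrated with a minimal sketch. Everything specific here is an assumption and not taken from the paper: symbols are represented as fixed-length feature vectors, clustering is a simple greedy distance-threshold scheme, and the "best script" is taken to be the one with the lowest mean nearest-template distance. The function names, the `radius` parameter, and the toy data are hypothetical.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def build_templates(symbols, radius):
    """Greedy clustering (an assumed stand-in): each cluster mean becomes a template."""
    clusters = []  # each entry is [sum_vector, count]
    for sym in symbols:
        for cluster in clusters:
            total, count = cluster
            centroid = [t / count for t in total]
            if dist(sym, centroid) <= radius:
                # Symbol is close enough: fold it into this cluster.
                cluster[0] = [t + s for t, s in zip(total, sym)]
                cluster[1] = count + 1
                break
        else:
            # No existing cluster matched: start a new one.
            clusters.append([list(sym), 1])
    return [[t / n for t in total] for total, n in clusters]


def train(training_set, radius=1.0):
    """training_set maps a script name to a list of symbol feature vectors."""
    return {script: build_templates(syms, radius)
            for script, syms in training_set.items()}


def identify_script(templates, symbols):
    """Return the script whose templates match the symbols most closely on average."""
    def mean_best_distance(script_templates):
        return sum(min(dist(s, t) for t in script_templates)
                   for s in symbols) / len(symbols)
    return min(templates, key=lambda script: mean_best_distance(templates[script]))


# Toy 2-D "symbols" for two hypothetical scripts.
training = {
    "script_a": [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]],
    "script_b": [[10.0, 0.0], [10.0, 0.2], [0.0, 10.0]],
}
templates = train(training, radius=1.0)
print(identify_script(templates, [[0.05, 0.1], [4.8, 5.1]]))
```

In a real system the feature vectors would come from segmented, size-normalized symbol images, and the clustering method and matching score would follow the paper itself; this sketch only shows the overall train-then-match shape.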
Keywords
optical character recognition; automatic script identification; cluster-based templates; document images; textual symbol clustering; typeset document images; Character recognition; Image analysis; Indexing; Laboratories; Natural languages; Optical character recognition software; Postal services; Shape; Text analysis; Typesetting
fLanguage
English
Journal_Title
Pattern Analysis and Machine Intelligence, IEEE Transactions on
Publisher
IEEE
ISSN
0162-8828
Type
jour
DOI
10.1109/34.574802
Filename
574802