DocumentCode :
1398228
Title :
Rotation invariant texture features and their use in automatic script identification
Author :
Tan, T.N.
Author_Institution :
Inst. of Autom., Acad. Sinica, Beijing, China
Volume :
20
Issue :
7
fYear :
1998
fDate :
7/1/1998
Firstpage :
751
Lastpage :
756
Abstract :
Concerns the extraction of rotation invariant texture features and the use of such features in script identification from document images. Rotation invariant texture features are computed based on an extension of the popular multi-channel Gabor filtering technique, and their effectiveness is tested with 300 randomly rotated samples of 15 Brodatz textures. These features are then used in an attempt to solve a practical but hitherto mostly overlooked problem in document image processing: the identification of the script of a machine-printed document. Automatic script and language recognition is an essential front-end process for the efficient and correct use of OCR and language translation products in a multilingual environment. Six languages (Chinese, English, Greek, Russian, Persian, and Malayalam) are chosen to demonstrate the potential of such a texture-based approach in script identification.
Keywords :
document image processing; feature extraction; image texture; language translation; optical character recognition; spatial filters; Brodatz textures; Chinese; English; Greek; Malayalam; Persian; Russian; automatic script identification; document images; language recognition; language translation; machine printed document; multi-channel Gabor filtering technique; multilingual environment; rotation invariant texture features; texture-based approach; Energy measurement; Feature extraction; Filtering; Gabor filters; Humans; Image recognition; Image texture analysis; Natural languages; Optical character recognition software; Testing;
fLanguage :
English
Journal_Title :
Pattern Analysis and Machine Intelligence, IEEE Transactions on
Publisher :
IEEE
ISSN :
0162-8828
Type :
jour
DOI :
10.1109/34.689305
Filename :
689305