Title :
Rotation invariant texture features and their use in automatic script identification
Author_Institution :
Inst. of Autom., Acad. Sinica, Beijing, China
fDate :
7/1/1998
Abstract :
This paper concerns the extraction of rotation invariant texture features and the use of such features in script identification from document images. Rotation invariant texture features are computed based on an extension of the popular multi-channel Gabor filtering technique, and their effectiveness is tested with 300 randomly rotated samples of 15 Brodatz textures. These features are then used in an attempt to solve a practical but hitherto mostly overlooked problem in document image processing: the identification of the script of a machine printed document. Automatic script and language recognition is an essential front-end process for the efficient and correct use of OCR and language translation products in a multilingual environment. Six languages (Chinese, English, Greek, Russian, Persian, and Malayalam) are chosen to demonstrate the potential of such a texture-based approach in script identification.
Keywords :
document image processing; feature extraction; image texture; language translation; optical character recognition; spatial filters; Brodatz textures; Chinese; English; Greek; Malayalam; Persian; Russian; automatic script identification; document images; language recognition; language translation; machine printed document; multi-channel Gabor filtering technique; multilingual environment; rotation invariant texture features; texture-based approach; Energy measurement; Feature extraction; Filtering; Gabor filters; Humans; Image recognition; Image texture analysis; Natural languages; Optical character recognition software; Testing;
Journal_Title :
Pattern Analysis and Machine Intelligence, IEEE Transactions on