DocumentCode :
2708918
Title :
Automatic detection of italic, bold and all-capital words in document images
Author :
Chaudhuri, B.B. ; Garain, U.
Author_Institution :
Comput. Vision & Pattern Recognition Unit, Indian Stat. Inst., Calcutta, India
Volume :
1
fYear :
1998
fDate :
16-20 Aug 1998
Firstpage :
610
Abstract :
We propose simple and fast algorithms for detecting italic, bold and all-capital words without performing actual character recognition. We present a statistical study which reveals that the detection of such words may play a key role in automatic information retrieval from documents. Moreover, detection of italic words can be used to improve the recognition accuracy of a text recognition system. The algorithms have been tested on a considerable number of document images and give accurate results on all of them. They are also very easy to implement.
Keywords :
document image processing; optical character recognition; OCR; all-capital word detection; automatic word detection; bold word detection; italic word detection; text recognition system; Books; Character recognition; Computer vision; Degradation; Information retrieval; Optical character recognition software; Pattern recognition; Software systems; Testing; Text recognition
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Pattern Recognition, 1998. Proceedings. Fourteenth International Conference on
Conference_Location :
Brisbane, Qld.
ISSN :
1051-4651
Print_ISBN :
0-8186-8512-3
Type :
conf
DOI :
10.1109/ICPR.1998.711217
Filename :
711217