DocumentCode :
1742148
Title :
Image-based document vectors for text retrieval
Author :
Yu, Zhaohui ; Tan, Chew Lim
Author_Institution :
Sch. of Comput., Nat. Univ. of Singapore, Singapore
Volume :
4
fYear :
2000
fDate :
2000
Firstpage :
393
Abstract :
We propose a method for constructing a vector that represents the content of a document image, in order to facilitate text retrieval. The method builds on an N-Gram algorithm that measures text similarity from the frequency of occurrence of n-character strings in electronic text. Instead of using ASCII values, the present study uses character images to obtain the document vector, and it has found promising results for use in our news article retrieval project.
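The text-based N-gram measure that the abstract takes as its starting point can be illustrated with a minimal sketch. It covers only the electronic-text case, not the character-image variant the paper proposes. The function names `ngram_vector` and `cosine_similarity`, the default `n = 3`, the case and whitespace normalization, and the choice of cosine as the similarity measure are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import Counter
from math import sqrt


def ngram_vector(text: str, n: int = 3) -> Counter:
    """Count every overlapping n-character string in the text.

    Hypothetical helper: lowercasing and whitespace collapsing are
    assumptions, not steps described in the abstract.
    """
    normalized = " ".join(text.lower().split())
    return Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse n-gram frequency vectors.

    Cosine is an assumed similarity measure, used here only as an example.
    """
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    query = ngram_vector("text retrieval from document images")
    article = ngram_vector("retrieval of text in document image archives")
    print(cosine_similarity(query, article))
```

In an image-based variant, the keys of the frequency vector would presumably be derived from character images rather than from character codes; how the paper does this is not stated in the abstract.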
Keywords :
document image processing; information retrieval; string matching; vectors; N-Gram algorithm; character images; character strings; image representation; image-based document vectors; news article retrieval project; text retrieval; text similarity measure; vector construction; Character recognition; Content based retrieval; Feature extraction; Image recognition; Image retrieval; Performance analysis; Position measurement; Shape; Text recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Pattern Recognition, 2000. Proceedings. 15th International Conference on
Conference_Location :
Barcelona
ISSN :
1051-4651
Print_ISBN :
0-7695-0750-6
Type :
conf
DOI :
10.1109/ICPR.2000.902941
Filename :
902941