DocumentCode
2016303
Title
Word image based latent semantic indexing for conceptual querying in document image databases
Author
Banerjee, Sameek ; Harit, Gaurav ; Chaudhury, Santanu
Volume
2
fYear
2007
fDate
23-26 Sept. 2007
Firstpage
1208
Lastpage
1212
Abstract
In this paper we present an application of latent semantic analysis (LSA) for indexing and retrieval of document images with text. The query is specified as a set of word images, and the documents that best match the query representation in the latent semantic space are retrieved. We show through extensive experiments on a large database that the use of LSA for document images provides improvements in retrieval precision, as is the case with electronic text documents.
Keywords
document image processing; image retrieval; indexing; conceptual querying; document image databases; document images indexing; document images retrieval; query representation; word image based latent semantic indexing; word images; Character recognition; Image analysis; Image databases; Image retrieval; Image segmentation; Indexing; Information analysis; Information retrieval; Ontologies; Optical character recognition software
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis and Recognition, 2007. ICDAR 2007. Ninth International Conference on
Conference_Location
Parana
ISSN
1520-5363
Print_ISBN
978-0-7695-2822-9
Type
conf
DOI
10.1109/ICDAR.2007.4377107
Filename
4377107