DocumentCode
3489551
Title
Multi-modal Information Integration for Document Retrieval
Author
Hassan, Ehtesham ; Chaudhury, Santanu ; Gopal, M.
Author_Institution
Dept. of Electr. Eng., Indian Inst. of Technol., Delhi, New Delhi, India
fYear
2013
fDate
25-28 Aug. 2013
Firstpage
1200
Lastpage
1204
Abstract
The paper proposes a novel multi-modal document image retrieval framework that exploits information from text and graphics regions. The framework applies a multiple kernel learning based hashing formulation to generate composite document indexes from the different modalities. Existing multimedia management methods for imaged text documents have not addressed the requirements of old and degraded documents. As a subsequent contribution, we propose a novel multi-modal document indexing framework for retrieval of old and degraded text documents that combines OCR'ed text and image based representation using learning. The proposed concepts are evaluated on sampled magazine cover pages and on documents in Devanagari script.
Keywords
image representation; image retrieval; indexing; learning (artificial intelligence); multimedia systems; optical character recognition; text analysis; Devanagari script documents; OCR; composite document index generation; degraded text document retrieval; graphics region; image based representation; imaged text documents; magazine cover pages; multimedia management methods; multimodal document image retrieval framework; multimodal document indexing framework; multimodal information integration; multiple kernel learning based hashing formulation; old text document retrieval; text region; Graphics; Image segmentation; Indexing; Kernel; Multimedia communication; Optimization; Semantics; Document Indexing; Multi-modal Retrieval; Multiple Kernel Learning;
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis and Recognition (ICDAR), 2013 12th International Conference on
Conference_Location
Washington, DC
ISSN
1520-5363
Type
conf
DOI
10.1109/ICDAR.2013.243
Filename
6628804