Title :
Retrieving Handwriting Styles: A Content Based Approach to Handwritten Document Retrieval
Author :
Bhardwaj, Anurag ; Thomas, Achint Oommen ; Fu, Yun ; Govindaraju, Venu
Author_Institution :
Dept. of Comput. Sci. & Eng., Univ. at Buffalo, Amherst, NY, USA
Abstract :
Large-scale retrieval of handwritten documents has primarily focused on searching for a query text in the OCR'ed transcription of the document images, which provides a limited view of the complete search process. Recent research advances have led to a number of content-based retrieval techniques that expand the search scope to the document content level (i.e., image features, meta-information). Based on similar motivations, we propose a new approach to content-based retrieval of handwritten document images by retrieving similar handwriting styles corresponding to a handwritten query image. At the core, we formulate this problem as the task of unsupervised writer style classification without the need for any style definitions or grammar. We build upon our previous work in writer style modeling and apply it to learn a style distribution for every handwriting sample in the corpus. Given a query image, all documents are ranked in order of their style distribution similarity. Experimental results on the publicly available IAM dataset demonstrate the efficacy of our proposed method over baseline feature-based systems.
Keywords :
content-based retrieval; document image processing; handwritten character recognition; image classification; image retrieval; search problems; text analysis; content based retrieval; handwriting style retrieval; handwritten document image retrieval; query text searching; unsupervised writer style classification; Content Based Retrieval; Topic Model; Writer Style Modeling
Conference_Title :
Frontiers in Handwriting Recognition (ICFHR), 2010 International Conference on
Conference_Location :
Kolkata
Print_ISBN :
978-1-4244-8353-2
DOI :
10.1109/ICFHR.2010.48