DocumentCode :
3135414
Title :
Word Spotting Based Retrieval of Urdu Handwritten Documents
Author :
Abidi, Abdessalem ; Jamil, Atif ; Siddiqi, Imran ; Khurshid, Kiran
Author_Institution :
Nat. Univ. of Sci. & Technol., Islamabad, Pakistan
fYear :
2012
fDate :
18-20 Sept. 2012
Firstpage :
331
Lastpage :
336
Abstract :
Urdu, one of the most widely used languages across different periods of history, has a valuable collection of handwritten scripts in state libraries across South Asia. Digitizing these collections can serve not only to preserve them but also to make them available to the general public. The absence of an Urdu OCR, however, limits the concept of a digital Urdu library to scanning and manual search of documents. We present a word spotting based search method for Urdu handwritten text. The text is first segmented into partial words, and a set of features is computed from each partial word. The user queries the system using a word image. The partial words in the query image are then matched with those in the database, and the matched partial words are merged into complete words. Evaluated on 90 handwritten documents, the proposed method achieved encouraging precision and recall rates.
Keywords :
digital libraries; document image processing; handwritten character recognition; information retrieval; natural languages; optical character recognition; Urdu OCR; Urdu handwritten document; Urdu handwritten text; digital Urdu library; handwritten script; partial word; precision rate; query image; recall rate; word image; word spotting based retrieval; Feature extraction; Handwriting recognition; Image segmentation; Indexing; Libraries; Vectors; Partial Words; Run length smoothing algorithm; Urdu handwritten text detection;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Frontiers in Handwriting Recognition (ICFHR), 2012 International Conference on
Conference_Location :
Bari
Print_ISBN :
978-1-4673-2262-1
Type :
conf
DOI :
10.1109/ICFHR.2012.289
Filename :
6424415