Title :
Date field extraction from handwritten documents using HMMs
Author :
Ranju Mandal;Partha Pratim Roy;Umapada Pal;Michael Blumenstein
Author_Institution :
School of Information and Communication Technology, Griffith University, Queensland, Australia
Abstract :
Automatic document interpretation and retrieval are important for accessing repositories of digitized handwritten documents. The date is an important field in such documents, with applications such as date-wise document indexing and retrieval. This paper proposes a framework for automatic date field extraction from handwritten documents. Sliding window-wise Local Gradient Histogram (LGH)-based features and a character-level Hidden Markov Model (HMM)-based approach are applied for segmentation and recognition. Individual date components in the month-word (a month written in word form, e.g. January or Jan), numeral, punctuation and contraction categories are segmented and labelled from a text line. Next, Histogram of Gradient (HoG)-based features and a Support Vector Machine (SVM)-based classifier are used to improve the results obtained from the HMM-based recognition system. Subsequently, both numeric and semi-numeric regular expressions of date patterns are applied to the labelled components to extract date patterns. The experiments are performed on an English document dataset, and the encouraging results indicate the effectiveness of the proposed system.
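The final step of the abstract, matching regular expressions against labelled components, can be illustrated with a minimal sketch. It assumes each recognized component has already been reduced to a one-letter label (N = numeral, M = month-word, P = punctuation, C = contraction such as "th"). The label alphabet, the two patterns and the function name `find_date_spans` are illustrative assumptions, not the patterns used in the paper.

```python
import re

# Hypothetical one-letter labels for the four component categories:
# N = numeral, M = month-word, P = punctuation, C = contraction (st, nd, rd, th).
# Both patterns are illustrative assumptions, not the paper's actual expressions.
NUMERIC = re.compile(r"NPNPN")                       # e.g. 12 / 05 / 2015
SEMI_NUMERIC = re.compile(r"NC?P?MP?N|MP?NC?P?N")    # e.g. 12 th March , 2015

def find_date_spans(components):
    """Return (kind, texts) for each date-like run of labelled components.

    `components` is a list of (label, text) pairs in reading order.
    """
    # Collapse the labels into one string so a regex can scan the sequence.
    labels = "".join(label for label, _ in components)
    found = []
    for kind, pattern in (("numeric", NUMERIC), ("semi-numeric", SEMI_NUMERIC)):
        for match in pattern.finditer(labels):
            # One label per component, so match offsets index components directly.
            texts = [text for _, text in components[match.start():match.end()]]
            found.append((kind, texts))
    return found
```

Because each component contributes exactly one character to the label string, a regex match span maps directly back to the components it covers, which keeps the pattern definitions short and independent of the recognized text itself.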
Keywords :
"Hidden Markov models","Writing"
Conference_Title :
Document Analysis and Recognition (ICDAR), 2015 13th International Conference on
DOI :
10.1109/ICDAR.2015.7333885