Title :
Automatic Detection of Handwritten Texts from Video Frames of Lectures
Author :
Banerjee, Prithu ; Bhattacharya, Ujjwal ; Chaudhuri, Bidyut B.
Author_Institution :
Soc. for Natural Language Technol. Res., Kolkata, India
Abstract :
Automatic recognition of handwritten texts in video lectures has important applications. In video lectures, the presenter usually writes on a white or colored board. The video camera often captures the writing board along with other objects, possibly including the presenter. Recognition of handwritten texts from such a video frame requires prior detection of the text regions in the frame. In this article, we present our recent study of text localization in such video lecture frames. We compute Scale Invariant Feature Transform (SIFT) descriptors densely over the entire frame. The descriptors are located on a regular grid with 5-pixel spacing, following the usual practice, and each uses a uniform patch of 60 × 60 pixels as its support, a size chosen on the basis of an empirical study. The SIFT descriptor at each location (grid point) is fed as a 128-dimensional input feature vector to a Multilayer Perceptron (MLP) network, which classifies each grid point as either text or non-text. Based on an aggregate of these responses at each pixel, we localize text regions in the input video frame. Next, we employ K-means clustering to detect the text components present in the localized region of the video frame. Finally, two simple rules are applied to reject some of the detected text components as noise. We obtained encouraging simulation results with this approach on a variety of video lecture frames.
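The dense grid and the per-pixel aggregation described in the abstract can be illustrated with a minimal sketch. Only the 5-pixel grid step and the 60 × 60 patch size come from the abstract; everything else is an assumption made for illustration. In particular, the abstract does not specify how responses are aggregated, so the sketch assumes a simple vote fraction over all patches covering a pixel. SIFT extraction and the trained MLP are not reproduced: they are replaced by a hypothetical callable `is_text(x, y)` that returns the text / non-text decision for the patch centred at a grid point. The names `grid_points` and `text_vote_map` are likewise hypothetical.

```python
from typing import Callable, List, Tuple

STEP = 5    # grid spacing in pixels (from the abstract)
PATCH = 60  # side of the square support patch in pixels (from the abstract)


def grid_points(width: int, height: int,
                step: int = STEP, patch: int = PATCH) -> List[Tuple[int, int]]:
    """Return centres (x, y) of all patches that fit entirely inside the frame."""
    half = patch // 2
    return [(x, y)
            for y in range(half, height - half + 1, step)
            for x in range(half, width - half + 1, step)]


def text_vote_map(width: int, height: int,
                  is_text: Callable[[int, int], bool],
                  step: int = STEP, patch: int = PATCH) -> List[List[float]]:
    """Per-pixel fraction of covering patches that were labelled as text.

    ASSUMPTION: the vote fraction stands in for the unspecified
    "aggregate response"; is_text stands in for SIFT + MLP.
    """
    half = patch // 2
    votes = [[0] * width for _ in range(height)]
    cover = [[0] * width for _ in range(height)]
    for cx, cy in grid_points(width, height, step, patch):
        label = 1 if is_text(cx, cy) else 0
        # Every pixel inside the patch receives this grid point's decision.
        for y in range(cy - half, cy + half):
            for x in range(cx - half, cx + half):
                cover[y][x] += 1
                votes[y][x] += label
    return [[votes[y][x] / cover[y][x] if cover[y][x] else 0.0
             for x in range(width)]
            for y in range(height)]


if __name__ == "__main__":
    # Toy classifier: pretend the right half of a 120 x 90 frame is text.
    vote_map = text_vote_map(120, 90, lambda x, y: x >= 60)
    print(len(grid_points(120, 90)), vote_map[0][0], vote_map[0][119])
```

Thresholding the resulting map would give a candidate text region; the threshold value, like the aggregation rule itself, would have to be taken from the full paper rather than from this sketch.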
Keywords :
document image processing; feature extraction; handwritten character recognition; image sensors; multilayer perceptrons; pattern clustering; transforms; 128-dimensional input feature vector; K-means clustering; MLP; SIFT descriptor; automatic handwritten texts detection; automatic handwritten texts recognition; colored board; multilayer perceptron network; scale invariant feature transform; text components; text localization; uniform patch size; video camera; video lecture frames; white board; writing board; Cameras; Databases; Handwriting recognition; Image color analysis; Noise; Text recognition; Training; Hand-written text localization; K-means algorithm; MLP network; Reading of video camera based white or color board notes; SIFT descriptor;
Conference_Title :
Frontiers in Handwriting Recognition (ICFHR), 2014 14th International Conference on
Conference_Location :
Heraklion
Print_ISBN :
978-1-4799-4335-7
DOI :
10.1109/ICFHR.2014.110