DocumentCode :
177872
Title :
Text detection and recognition in natural scenes and consumer videos
Author :
Jain, Abhishek ; Xujun Peng ; Xiaodan Zhuang ; Natarajan, Prem ; Huaigu Cao
Author_Institution :
Speech, Language & Multimedia Bus. Unit, Raytheon BBN Technol., Cambridge, MA, USA
fYear :
2014
fDate :
4-9 May 2014
Firstpage :
1245
Lastpage :
1249
Abstract :
We propose an end-to-end system for text detection and recognition in natural scenes and consumer videos. Maximally Stable Extremal Regions, which are robust to illumination and viewpoint variations, are selected as text candidates. Rich shape descriptors such as Histogram of Oriented Gradients, Gabor filter, corner and geometrical features are used to represent the candidates, which are then classified using a support vector machine. Positively labeled candidates serve as anchor regions for word formation. We then group candidate regions based on geometric and color properties to form word boundaries. To speed up the system for practical applications, we use a Partial Least Squares approach for dimensionality reduction. The detected words are binarized, filtered and passed to a hidden Markov model-based Optical Character Recognition (OCR) system for recognition. We show significant improvement in text detection and recognition tasks over previous approaches on a large consumer video dataset. Furthermore, the event detection system built upon the OCR output of this approach outperformed multiple other OCR-only submissions in the recently concluded NIST TRECVID 2013 multimedia event detection evaluations.
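The word-formation step described in the abstract (anchor regions plus grouping by geometric and color properties) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Region` type, the `compatible` and `group_into_words` functions, every threshold value, and the rule that a group is kept only if it contains at least one anchor are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Region:
    """Hypothetical candidate region: bounding box, mean RGB color, classifier label."""
    x: int
    y: int
    w: int
    h: int
    color: Tuple[int, int, int]
    anchor: bool  # True if the classifier labeled this candidate as text


def compatible(a: Region, b: Region,
               max_gap_ratio: float = 1.0,
               max_height_ratio: float = 1.5,
               max_color_dist: float = 60.0) -> bool:
    """Assumed pairwise test: similar height, same text line, close, similar color."""
    hi, lo = max(a.h, b.h), min(a.h, b.h)
    if hi > max_height_ratio * lo:
        return False
    # Require the two boxes to overlap vertically by at least half the smaller height.
    overlap = min(a.y + a.h, b.y + b.h) - max(a.y, b.y)
    if overlap < 0.5 * lo:
        return False
    # Horizontal gap between the boxes (negative when they overlap).
    gap = max(a.x, b.x) - min(a.x + a.w, b.x + b.w)
    if gap > max_gap_ratio * hi:
        return False
    # Euclidean distance between mean colors.
    dist = sum((p - q) ** 2 for p, q in zip(a.color, b.color)) ** 0.5
    return dist <= max_color_dist


def group_into_words(regions: List[Region]) -> List[Tuple[int, int, int, int]]:
    """Merge compatible regions with union-find; return (x, y, w, h) word boxes."""
    parent = list(range(len(regions)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(regions)):
        for j in range(i + 1, len(regions)):
            if compatible(regions[i], regions[j]):
                parent[find(i)] = find(j)

    groups: Dict[int, List[Region]] = {}
    for i, region in enumerate(regions):
        groups.setdefault(find(i), []).append(region)

    boxes = []
    for members in groups.values():
        # Assumption: a word must contain at least one positively labeled anchor.
        if not any(r.anchor for r in members):
            continue
        x0 = min(r.x for r in members)
        y0 = min(r.y for r in members)
        x1 = max(r.x + r.w for r in members)
        y1 = max(r.y + r.h for r in members)
        boxes.append((x0, y0, x1 - x0, y1 - y0))
    return sorted(boxes)
```

Union-find keeps the grouping transitive, so characters merge into one word even when only adjacent pairs pass the pairwise test. The quadratic pair loop is adequate for a sketch; a real system would likely restrict comparisons to spatial neighbors.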
Keywords :
Gabor filters; computational geometry; filtering theory; hidden Markov models; image classification; image colour analysis; object detection; optical character recognition; regression analysis; support vector machines; text analysis; video signal processing; Gabor filter; NIST TRECVID 2013 multimedia event detection evaluations; OCR system; OCR-only based submissions; color properties; consumer video dataset; dimensionality reduction; geometric properties; geometrical features; hidden Markov model based optical character recognition system; histogram-of-oriented gradients; illumination variation selection; maximally stable extremal regions; natural scenes; partial least squares approach; shape descriptors; support vector machine; text candidates; text detection; text recognition; viewpoint variation selection; Event detection; Feature extraction; Image edge detection; Optical character recognition software; Support vector machines; Text recognition; Videos; Partial Least Squares; consumer video; event detection; text detection and recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location :
Florence
Type :
conf
DOI :
10.1109/ICASSP.2014.6853796
Filename :
6853796