DocumentCode :
1763737
Title :
Semiautomatic Ground Truth Generation for Text Detection and Recognition in Video Images
Author :
Trung Quy Phan ; Shivakumara, Palaiahnakote ; Bhowmick, Sourav S. ; Shimiao Li ; Chew Lim Tan ; Pal, Umapada
Author_Institution :
Dept. of Comput. Sci., Nat. Univ. of Singapore, Singapore, Singapore
Volume :
24
Issue :
8
fYear :
2014
fDate :
Aug. 2014
Firstpage :
1277
Lastpage :
1287
Abstract :
Although a large number of methods for video text detection and recognition have been proposed over the past years, it is hard to find the best state-of-the-art method because of the nonavailability of standard datasets, ground truth, and common evaluation measures. Therefore, in this paper, we propose a semiautomatic system for ground truth generation for video text detection and recognition, which covers English and Chinese text of different orientations. The system allows the user to manually correct the ground truth if the automatic method produces incorrect results. We propose eleven attributes at the word level, namely: line index, word index, coordinate values of bounding box, area, content, script type, orientation information, type of text (caption/scene), condition of text (distortion/distortion free), start frame, and end frame, to evaluate the performance of the method. We also introduce a new dataset that consists of 466 video frames collected from the TRECVID 2005 and 2006 databases. The video frames in our dataset contain both horizontal texts (278 frames: 181 with English texts and 97 with Chinese texts) and nonhorizontal texts (188 frames: 140 English and 48 Chinese). Furthermore, the performance of the proposed system is compared with existing text detection methods by calculating measures manually and automatically to show the usefulness of our semiautomatic system. The ground truth and the semiautomatic system will be released to the public.
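The eleven word-level attributes named in the abstract can be pictured as a single record type. The following is a minimal sketch only: the field names, the types, the four-corner bounding box representation, and the sample values are all assumptions made for illustration and are not taken from the paper or its released ground truth format.

```python
from dataclasses import dataclass, fields
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_area(corners: List[Point]) -> float:
    """Area of a simple polygon via the shoelace formula.

    Used here so that the area is also defined for rotated
    (nonhorizontal) boxes given as four corner points.
    """
    total = 0.0
    for (x1, y1), (x2, y2) in zip(corners, corners[1:] + corners[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


@dataclass
class WordGroundTruth:
    """One word-level record; all names and types are hypothetical."""
    line_index: int            # index of the text line in the frame
    word_index: int            # index of the word within its line
    bounding_box: List[Point]  # assumed: four (x, y) corner points
    area: float                # area of the bounding box
    content: str               # transcription of the word
    script: str                # e.g. "English" or "Chinese"
    orientation: str           # e.g. "horizontal" or "nonhorizontal"
    text_type: str             # "caption" or "scene"
    condition: str             # "distortion" or "distortion free"
    start_frame: int           # first frame in which the word appears
    end_frame: int             # last frame in which the word appears


# Illustrative values only; not drawn from the dataset.
box = [(10.0, 20.0), (110.0, 20.0), (110.0, 50.0), (10.0, 50.0)]
word = WordGroundTruth(
    line_index=0,
    word_index=0,
    bounding_box=box,
    area=polygon_area(box),
    content="NEWS",
    script="English",
    orientation="horizontal",
    text_type="caption",
    condition="distortion free",
    start_frame=120,
    end_frame=180,
)

# The record carries exactly the eleven attributes listed in the abstract.
attribute_count = len(fields(WordGroundTruth))

# Frame counts as stated in the abstract.
horizontal = 181 + 97      # English + Chinese
nonhorizontal = 140 + 48   # English + Chinese
total_frames = horizontal + nonhorizontal
```

Storing the box as corner points rather than as an axis-aligned rectangle is one possible choice that accommodates nonhorizontal text; the released ground truth may use a different encoding.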
Keywords :
natural language processing; optical character recognition; video databases; video signal processing; Chinese video text detection; TRECVID 2005 database; TRECVID 2006 database; bounding box; common evaluation measures; line index; orientation information; semiautomatic ground truth generation; text recognition; video frames; video images; word index; Accuracy; Graphics; Indexes; Optical character recognition software; Standards; Chinese video text recognition; ground truthing; multioriented video text detection and recognition; video text detection; video text recognition
fLanguage :
English
Journal_Title :
Circuits and Systems for Video Technology, IEEE Transactions on
Publisher :
IEEE
ISSN :
1051-8215
Type :
jour
DOI :
10.1109/TCSVT.2014.2305515
Filename :
6739120