DocumentCode
2148585
Title
Video Script Identification Based on Text Lines
Author
Phan, Trung Quy ; Shivakumara, Palaiahnakote ; Ding, Zhang ; Lu, Shijian ; Tan, Chew Lim
Author_Institution
Sch. of Comput., Nat. Univ. of Singapore, Singapore, Singapore
fYear
2011
fDate
18-21 Sept. 2011
Firstpage
1240
Lastpage
1244
Abstract
In this paper, we present a new method for video script identification, which is essential before choosing an appropriate OCR engine for identifying text lines when a video frame contains more than one language. The input for script identification is the text lines obtained by our text detection method. We extract the upper and lower extreme points of each connected component of the Canny edges of the text lines. The extracted points are connected to study the behavior of the upper and lower lines. The direction of each 10-pixel segment of the lines is determined using PCA. The average angle of the segments of the upper and lower lines is computed to study the smoothness and cursiveness of the lines. In addition, to discriminate the scripts accurately, the method divides a text line into five equal zones horizontally to study the smoothness and cursiveness of the upper and lower lines of each zone. We evaluate the method by conducting experiments on different combinations of languages: English and Chinese; English and Tamil; Chinese and Tamil; and English, Chinese and Tamil.
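The segment-direction step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the upper (or lower) line has already been resampled to one y value per pixel column, treats a "10-pixel segment" as 10 consecutive samples, and measures each segment's angle against the horizontal axis. The function names `segment_angle`, `average_angle`, and `zone_features` are hypothetical and chosen only for illustration.

```python
import numpy as np

def segment_angle(points):
    """Angle in degrees (0 to 90) between the principal axis of a
    2-D point set and the horizontal axis, obtained via PCA."""
    pts = np.asarray(points, dtype=float)
    centered = pts - pts.mean(axis=0)
    scatter = centered.T @ centered            # 2x2 scatter matrix
    _, eigvecs = np.linalg.eigh(scatter)       # eigenvalues in ascending order
    dx, dy = eigvecs[:, -1]                    # principal direction
    return float(np.degrees(np.arctan2(abs(dy), abs(dx))))

def average_angle(xs, ys, seg_len=10):
    """Mean PCA angle over consecutive seg_len-sample pieces of a line.
    Assumption: xs, ys hold one sample per pixel column."""
    angles = []
    for start in range(0, len(xs) - seg_len + 1, seg_len):
        seg = np.column_stack([xs[start:start + seg_len],
                               ys[start:start + seg_len]])
        angles.append(segment_angle(seg))
    return float(np.mean(angles)) if angles else 0.0

def zone_features(xs, ys, zones=5, seg_len=10):
    """Average angle per horizontal zone (five equal zones by default)."""
    bounds = np.linspace(0, len(xs), zones + 1).astype(int)
    return [average_angle(xs[a:b], ys[a:b], seg_len)
            for a, b in zip(bounds[:-1], bounds[1:])]

# Hypothetical usage: a flat line and a 45-degree line, 100 columns each.
xs = np.arange(100)
flat = average_angle(xs, np.zeros(100))
diag = zone_features(xs, xs.astype(float))
```

Under these assumptions, a smooth line yields small average angles while a jagged line yields larger ones. How the resulting values are turned into a script decision is not specified in the abstract and is left out of the sketch.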
Keywords
document image processing; natural language processing; object detection; principal component analysis; text analysis; video signal processing; Canny edges; OCR engine; PCA; text detection method; text line identification; video script identification; Feature extraction; Image edge detection; Optical character recognition software; Testing; Text recognition; Cursiveness; Smoothness; Upper and lower points; Video script line identification; Video text line;
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis and Recognition (ICDAR), 2011 International Conference on
Conference_Location
Beijing
ISSN
1520-5363
Print_ISBN
978-1-4577-1350-7
Electronic_ISBN
1520-5363
Type
conf
DOI
10.1109/ICDAR.2011.250
Filename
6065508