Author :
Moradi, Mohieddin ; Mozaffari, Saeed ; Orouji, Ali Asghar
Abstract :
Summary form only given. Video text information plays an important role in semantic-based video analysis, indexing, and retrieval. In this paper we propose a novel text detection approach based on intrinsic characteristics of Farsi text lines, which is more robust to complex backgrounds and various font styles. First, a two-level Gaussian pyramid is created from the input I-frame images. Then, to make better use of edge features and to increase the number of edges in text areas, corners are extracted from each level of the pyramid, based on an initial font size and some heuristic rules. Afterwards, corner histogram analysis is performed. By combining appropriate discrete cosine transform (DCT) coefficients, a texture intensity picture is created. The input image is divided into macroblocks, from which features are extracted and fed into a support vector machine (SVM) classifier that labels each block as text or non-text. Finally, empirical rules are applied to the detected candidate text areas to refine the text localization results. Experimental results demonstrate that the proposed approach can be used as an automatic text detection system that is robust to font size, font colour, and background complexity.
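Two of the stages named in the abstract, the two-level Gaussian pyramid and the DCT-based texture measure per macroblock, can be illustrated with a minimal sketch. This is not the authors' implementation: the 5-tap binomial kernel, the 8x8 block size, the use of the summed absolute AC coefficients as "texture intensity", and all function names are assumptions made for illustration. Corner extraction, histogram analysis, and the SVM stage are omitted.

```python
import math

# Assumption: a 5-tap binomial kernel as a stand-in for the Gaussian filter.
KERNEL = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]


def blur_rows(img):
    """Convolve every row with KERNEL, replicating edge pixels."""
    width = len(img[0])
    return [
        [
            sum(k * row[min(max(x + d, 0), width - 1)]
                for d, k in zip(range(-2, 3), KERNEL))
            for x in range(width)
        ]
        for row in img
    ]


def transpose(img):
    return [list(col) for col in zip(*img)]


def pyramid_down(img):
    """Blur separably (rows, then columns), then keep every second sample."""
    blurred = transpose(blur_rows(transpose(blur_rows(img))))
    return [row[::2] for row in blurred[::2]]


def gaussian_pyramid(img, levels=2):
    """Return [original, half-size, ...] with `levels` entries."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(pyramid_down(pyramid[-1]))
    return pyramid


def dct2(block):
    """Orthonormal 2-D DCT-II of a square block (direct O(n^4) form)."""
    n = len(block)
    scale = [math.sqrt(1 / n)] + [math.sqrt(2 / n)] * (n - 1)
    cos = [[math.cos((2 * x + 1) * u * math.pi / (2 * n)) for x in range(n)]
           for u in range(n)]
    return [
        [
            scale[u] * scale[v] * sum(
                block[y][x] * cos[u][y] * cos[v][x]
                for y in range(n) for x in range(n)
            )
            for v in range(n)
        ]
        for u in range(n)
    ]


def texture_energy(block):
    """Assumed texture measure: sum of absolute AC (non-DC) coefficients."""
    coeffs = dct2(block)
    n = len(block)
    return sum(abs(coeffs[u][v])
               for u in range(n) for v in range(n) if (u, v) != (0, 0))


def block_features(img, size=8):
    """Map each macroblock's top-left (row, col) to its texture energy."""
    features = {}
    for by in range(0, len(img) - size + 1, size):
        for bx in range(0, len(img[0]) - size + 1, size):
            block = [row[bx:bx + size] for row in img[by:by + size]]
            features[(by, bx)] = texture_energy(block)
    return features
```

In a full pipeline, `block_features` would be run on each pyramid level and the per-block values (together with corner-based features) passed to a classifier; a flat background block yields an energy near zero, while a block with stroke-like intensity changes yields a large value.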