General and domain-specific techniques for detecting and recognizing superimposed text in video

Author

Zhang, DongQing ; Rajendran, Raj Kumar ; Chang, Shih-Fu

Author_Institution

Dept. of Electr. Eng., Columbia Univ., New York, NY, USA

Volume

1

fYear

2002

fDate

2002

Abstract

We have developed generic and domain-specific algorithms for caption text extraction and recognition in digital video. Our system includes several unique features. For caption box location, we combine compressed-domain features derived from DCT coefficients and motion vectors, and we employ long-term temporal consistency to enhance localization performance. For character segmentation, we use a single-pass, threshold-free approach that combines classification and projection to address noisy segmentation, text intensity variation, and algorithm complexity. For recognition, we use Zernike moments to achieve higher accuracy. Finally, domain knowledge is exploited: a statistical transition graph model is used to enhance recognition of domain-specific characters, such as ball counts and game scores in baseball videos. The algorithms achieved real-time speed and significantly improved recognition accuracy. Furthermore, although the experiments were conducted on baseball videos only, these algorithms (except the transition model) are general and can be used in other applications, such as news and films.
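
The abstract does not give the details of the statistical transition graph model, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the ball count is a state (balls, strikes), that only a few state changes are legal between consecutive readings, and that a standard Viterbi search picks the most likely legal sequence. All names (STATES, allowed_next, decode) and all probabilities (p_stay, p_correct) are hypothetical choices made for illustration.

```python
import math
from itertools import product

# Hypothetical state space: (balls, strikes) with 0-3 balls and 0-2 strikes.
STATES = list(product(range(4), range(3)))

def allowed_next(state):
    """States reachable in one step: stay, one more ball, one more strike, or reset."""
    balls, strikes = state
    nxt = {state, (0, 0)}  # unchanged caption, or a new batter
    if balls < 3:
        nxt.add((balls + 1, strikes))
    if strikes < 2:
        nxt.add((balls, strikes + 1))
    return nxt

def log_transition(prev, cur, p_stay=0.9):
    """Log-probability of moving prev -> cur; illegal moves get -inf (assumed values)."""
    moves = allowed_next(prev)
    if cur not in moves:
        return -math.inf
    if cur == prev:
        return math.log(p_stay)
    return math.log((1.0 - p_stay) / (len(moves) - 1))

def log_emission(observed, state, p_correct=0.8):
    """Log-probability that the recognizer reports `observed` when the truth is `state`."""
    if observed == state:
        return math.log(p_correct)
    return math.log((1.0 - p_correct) / (len(STATES) - 1))

def decode(observations):
    """Viterbi decoding: most likely legal count sequence given noisy readings."""
    score = {s: log_emission(observations[0], s) for s in STATES}  # uniform prior
    back = []
    for obs in observations[1:]:
        new_score, pointers = {}, {}
        for cur in STATES:
            best_prev = max(STATES, key=lambda p: score[p] + log_transition(p, cur))
            new_score[cur] = (score[best_prev] + log_transition(best_prev, cur)
                              + log_emission(obs, cur))
            pointers[cur] = best_prev
        score = new_score
        back.append(pointers)
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):  # follow back-pointers to the first frame
        state = pointers[state]
        path.append(state)
    return path[::-1]

# The third reading (3, 0) cannot legally follow (1, 0), so it is treated as noise.
readings = [(1, 0), (1, 0), (3, 0), (1, 0), (1, 1)]
print(decode(readings))
```

With these assumed probabilities, the isolated reading (3, 0) is replaced by (1, 0), because a jump from one ball to three and back is not a legal path through the graph. A real system would estimate the transition and emission probabilities from data rather than fixing them by hand.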

Keywords

character recognition; data compression; discrete cosine transforms; image classification; image segmentation; object detection; statistical analysis; transform coding; video coding; DCT coefficients; Zernike moments; ball counts; baseball videos; caption box location; character segmentation; classification; compressed-domain features; domain-specific techniques; domain-specific video algorithms; game score; generic video algorithms; localization performance; long-term temporal consistency; motion vectors; noisy segmentation; projection; recognition accuracy; recognition performance; single-pass threshold free approach; statistical transition graph model; superimposed text detection; superimposed text recognition; text intensity variation; transition model; Character recognition; Discrete cosine transforms; Games; Image recognition; Image retrieval; Indexing; Layout; Statistical distributions; Text recognition; Video compression;

fLanguage

English

Publisher

IEEE

Conference_Titel

Image Processing. 2002. Proceedings. 2002 International Conference on

ISSN

1522-4880

Print_ISBN

0-7803-7622-6

Type

conf

DOI

10.1109/ICIP.2002.1038093

Filename

1038093