• DocumentCode
    2326264
  • Title
    General and domain-specific techniques for detecting and recognizing superimposed text in video
  • Author
    Zhang, DongQing ; Rajendran, Raj Kumar ; Chang, Shih-Fu
  • Author_Institution
    Dept. of Electr. Eng., Columbia Univ., New York, NY, USA
  • Volume
    1
  • fYear
    2002
  • fDate
    2002
  • Abstract
    We have developed generic and domain-specific video algorithms for caption text extraction and recognition in digital video. Our system includes several unique features. For caption box location, we combine compressed-domain features derived from DCT coefficients and motion vectors, and employ long-term temporal consistency to enhance localization performance. For character segmentation, we use a single-pass, threshold-free approach that combines classification and projection to address noisy segmentation, text intensity variation, and algorithm complexity. For recognition, we use Zernike moments to improve accuracy. Finally, domain knowledge is exploited: a statistical transition graph model enhances recognition of domain-specific characters, such as ball counts and game scores in baseball videos. The algorithms achieved real-time speed and significantly improved recognition accuracy. Furthermore, although the experiments were conducted on baseball videos only, these algorithms (except the transition model) are general and can be used in other applications, such as news and films.
  • Keywords
    character recognition; data compression; discrete cosine transforms; image classification; image segmentation; object detection; statistical analysis; transform coding; video coding; DCT coefficients; Zernike moments; ball counts; baseball videos; caption box location; character segmentation; classification; compressed-domain features; domain-specific techniques; domain-specific video algorithms; game score; generic video algorithms; localization performance; long-term temporal consistency; motion vectors; noisy segmentation; projection; recognition accuracy; recognition performance; single-pass threshold free approach; statistical transition graph model; superimposed text detection; superimposed text recognition; text intensity variation; transition model; Character recognition; Discrete cosine transforms; Games; Image recognition; Image retrieval; Indexing; Layout; Statistical distributions; Text recognition; Video compression
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Image Processing. 2002. Proceedings. 2002 International Conference on
  • ISSN
    1522-4880
  • Print_ISBN
    0-7803-7622-6
  • Type
    conf
  • DOI
    10.1109/ICIP.2002.1038093
  • Filename
    1038093