Multi-frame combination for robust videotext recognition

Author

Prasad, Rohit ; Saleem, Shirin ; Macrostie, Ehry ; Natarajan, Prem ; Decerbo, Michael

Author_Institution

BBN Technol., Cambridge, MA

fYear

2008

fDate

March 31 2008-April 4 2008

Firstpage

1357

Lastpage

1360

Abstract

Optical character recognition (OCR) of overlaid text in video streams is a challenging problem due to various factors including the presence of dynamic backgrounds, color, and low resolution. In video feeds such as Broadcast News, a particular overlaid text region usually persists for multiple frames during which the background may or may not vary. In this paper we explore two innovative techniques that exploit such multi-frame persistence of videotext. The first technique uses multiple instances to generate a single enhanced image for recognition. The second technique uses the NIST ROVER algorithm developed for speech recognition to combine 1-best hypotheses from different frames of a text region. Significant improvement in the word error rate (WER) is obtained by using ROVER when compared to recognizing a single instance. The WER is further reduced by combining hypotheses from frame instances, which were generated using character models trained with different binarization thresholds. A 20% relative reduction in the WER was achieved for multi-frame combination over decoding a single frame instance.
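
The second technique in the abstract, combining 1-best hypotheses from several frames of the same text region, can be illustrated with a minimal sketch. This is not the NIST ROVER implementation or the authors' system. It is a simplified, ROVER-style scheme that assumes plain frequency voting with no confidence scores. The names `align`, `combine`, and `NULL`, as well as the sample strings, are hypothetical and chosen only for illustration.

```python
from collections import Counter

NULL = "@"  # hypothetical placeholder meaning "no word in this slot"


def align(network, words):
    """Edit-distance alignment of a word list against a list of voting slots.

    Returns (slot_index or None, word or None) pairs in left-to-right order.
    A None slot index marks a word inserted relative to the network;
    a None word marks a slot that this hypothesis skipped.
    """
    n, m = len(network), len(words)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if words[j - 1] in network[i - 1] else 1
            cost[i][j] = min(cost[i - 1][j - 1] + sub,
                             cost[i - 1][j] + 1,
                             cost[i][j - 1] + 1)
    # Trace back from the bottom-right corner to recover the alignment.
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = 0 if words[j - 1] in network[i - 1] else 1
            if cost[i][j] == cost[i - 1][j - 1] + sub:
                pairs.append((i - 1, words[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            pairs.append((i - 1, None))
            i -= 1
        else:
            pairs.append((None, words[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs


def combine(hypotheses):
    """Merge per-frame 1-best strings slot by slot and keep the majority word."""
    network = [Counter({w: 1}) for w in hypotheses[0].split()]
    seen = 1  # number of hypotheses already merged into the network
    for hyp in hypotheses[1:]:
        merged = []
        for slot_idx, word in align(network, hyp.split()):
            if slot_idx is None:
                # New slot: every earlier hypothesis implicitly voted NULL here.
                slot = Counter({NULL: seen})
                slot[word] += 1
            else:
                slot = network[slot_idx]
                slot[word if word is not None else NULL] += 1
            merged.append(slot)
        network = merged
        seen += 1
    best = [slot.most_common(1)[0][0] for slot in network]
    return " ".join(w for w in best if w != NULL)


# Made-up OCR outputs for three frames of the same overlaid text region.
frames = [
    "BREAKING NEWS FROM IRAQ",
    "BREAKING NEVVS FROM IRAQ",
    "BREAKING NEWS FR0M IRAQ",
]
combined = combine(frames)
```

Because each frame tends to make different errors as the background changes, a word that is misrecognized in one frame can be outvoted by the other frames. Ties are broken arbitrarily in this sketch; a fuller system would typically use confidence scores for that.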

Keywords

optical character recognition; text analysis; video signal processing; 1-best hypotheses; multiframe combination; multiframe persistence; overlaid text; robust videotext recognition; video streams; word error rate; Broadcasting; Character recognition; Feeds; Image recognition; Multimedia communication; NIST; Optical character recognition software; Robustness; Speech recognition; Streaming media; Hidden Markov Models; Videotext

fLanguage

English

Publisher

IEEE

Conference_Title

Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on

Conference_Location

Las Vegas, NV

ISSN

1520-6149

Print_ISBN

978-1-4244-1483-3

Electronic_ISBN

1520-6149

Type

conf

DOI

10.1109/ICASSP.2008.4517870

Filename

4517870