• DocumentCode
    3179608
  • Title

    Speech retrieval for TV news programs by fusing the audio and video information

  • Author

    Gao, Xinbo ; Li, Jie ; Ji, Hongbing

  • Author_Institution
    Sch. of Electron. Eng., Xidian Univ., Xi'an, China
  • Volume
    2
  • fYear
    2002
  • fDate
    26-30 Aug. 2002
  • Firstpage
    994
  • Abstract
    A typical news story contains a brief report by the anchor person(s) in the studio, as well as news footage from the field. Our investigation shows that our recognizer performs better when indexing audio from the studio than audio from the field. To automatically extract the "reliable" audio segments for speech retrieval, we attempt to detect studio-to-field transitions by means of video parsing. Our research is based on 146 news stories collected from the Hong Kong TVB Jade station. Retrieval using the entire audio track gave an average inverse rank (AIR) of 0.759, whereas retrieval based only on the studio recordings, identified through video parsing, produced AIR = 0.765.
  • Keywords
    audio signal processing; database indexing; feature extraction; information retrieval; multimedia databases; speech processing; speech recognition; television production; Hong Kong TVB Jade station; TV news programs; audio indexing; audio video information fusion; automatic extraction; reliable audio segments; speech retrieval; studio-to-field transition detection; video parsing; Audio recording; Automatic speech recognition; Hidden Markov models; Indexing; Information retrieval; Speech recognition; TV; Testing; Video on demand; Video recording
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal Processing, 2002 6th International Conference on
  • Print_ISBN
    0-7803-7488-6
  • Type

    conf

  • DOI
    10.1109/ICOSP.2002.1179955
  • Filename
    1179955
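
The abstract reports retrieval quality as average inverse rank (AIR) without defining it. The sketch below assumes the common definition, analogous to mean reciprocal rank: for each query, take the reciprocal of the 1-based rank at which the relevant story is retrieved, then average over all queries. The function name `average_inverse_rank` and the example ranks are hypothetical and are not taken from the paper, which may define the metric differently.

```python
from typing import Sequence


def average_inverse_rank(ranks: Sequence[int]) -> float:
    """Mean of 1/rank over queries.

    Assumption: each entry is the 1-based rank at which the relevant
    news story was retrieved for one query.
    """
    if not ranks:
        raise ValueError("at least one query rank is required")
    if any(r < 1 for r in ranks):
        raise ValueError("ranks must be positive, 1-based integers")
    return sum(1.0 / r for r in ranks) / len(ranks)


# Hypothetical ranks for four queries (illustrative only, not from the paper).
example_ranks = [1, 2, 1, 4]
air = average_inverse_rank(example_ranks)
print(f"AIR = {air:.4f}")
```

Under this assumed definition, AIR equals 1.0 when every query retrieves its relevant story at rank 1, so the reported values of 0.759 and 0.765 would indicate that the relevant story typically appears near the top of the ranked list.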