• DocumentCode
    1696228
  • Title
    Language model adaptation for video lectures transcription
  • Author
    Martinez-Villaronga, Adria ; del Agua, Miguel A. ; Andres-Ferrer, Jesus ; Juan, Alfons
  • Author_Institution
    PRHLT, Univ. Politec. de Valencia (UPV), València, Spain
  • fYear
    2013
  • Firstpage
    8450
  • Lastpage
    8454
  • Abstract
    Video lectures are currently being digitised all over the world for their enormous value as a reference resource. Many of these lectures are accompanied by slides, which offer a great opportunity to improve the performance of ASR systems. We propose a simple yet powerful extension to the linear interpolation of language models for adapting language models with slide information. Two types of slides are considered: correct slides, and slides automatically extracted from the videos with OCR. Furthermore, we compare time-aligned and unaligned slides. Results show an improvement of up to 3.8 absolute WER points when using correct slides. Surprisingly, when using automatic slides obtained with poor OCR quality, the ASR system still improves by up to 2.2 absolute WER points.
  • Keywords
    interpolation; multimedia systems; optical character recognition; video signal processing; ASR systems performance; OCR quality; absolute WER points; automatic extracted slides; correct slides; language model adaptation; language models; linear interpolation; reference resource; slide information; time aligned slides; unaligned slides; video lectures transcription; Adaptation models; Computational modeling; Hidden Markov models; Interpolation; Mathematical model; Optical character recognition software; Vocabulary; language model adaptation; video lectures;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
  • Conference_Location
    Vancouver, BC
  • ISSN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2013.6639314
  • Filename
    6639314