• DocumentCode
    730850
  • Title
    Language model adaptation for academic lectures using character recognition result of presentation slides
  • Author
    Akita, Yuya; Tong, Yizheng; Kawahara, Tatsuya
  • Author_Institution
    Sch. of Inf., Kyoto Univ., Kyoto, Japan
  • fYear
    2015
  • fDate
    19-24 April 2015
  • Firstpage
    5431
  • Lastpage
    5435
  • Abstract
    For automatic speech recognition (ASR) of lectures, texts of presentation slides are expected to be useful for adapting a language model, while slide texts are not always available in a machine-readable form. In this paper, we propose a language model adaptation framework that uses character recognition results of slide images in a lecture video. Since character recognition results contain many errors, we introduce a filtering method based on morphological and topic information. Then we perform linear interpolation of the baseline language model with the filtered results and also relevant texts which are selected automatically from a text database using the filtered results. We further conduct a cache-based adaptation method on the resulting language model, in which keywords in the filtered results are cached and used to boost the word probability. In an experimental evaluation over real lectures, we obtained a significant improvement of ASR performance by this adaptation framework.
  • Keywords
    character recognition; filtering theory; interpolation; speech recognition; academic lectures; automatic speech recognition; baseline language model; cache-based adaptation; character recognition result; filtering method; language model adaptation; lecture video; linear interpolation; machine-readable form; morphological information; presentation slides; slide images; topic information; word probability; Accuracy; Adaptation models; Interpolation; Mathematical model; Optical character recognition software; Speech; Speech recognition; Language model; adaptation; character recognition; lectures; presentation slides;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
  • Conference_Location
    South Brisbane, QLD
  • Type
    conf
  • DOI
    10.1109/ICASSP.2015.7179009
  • Filename
    7179009
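
The abstract describes linear interpolation of a baseline language model with filtered slide text, followed by a cache-based boost for keywords. The following is only a minimal sketch of that general idea using unigram models. The function names, the toy token lists, and the weights (0.8/0.2 and a cache weight of 0.1) are illustrative assumptions and are not taken from the paper, which may use different model orders, weights, and a different cache formulation.

```python
from collections import Counter


def unigram(tokens):
    """Maximum-likelihood unigram model from a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def interpolate(models, weights):
    """Linear interpolation: P(w) = sum_i lambda_i * P_i(w)."""
    assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to 1"
    vocab = set().union(*models)  # union of all model vocabularies
    return {
        w: sum(lam * m.get(w, 0.0) for lam, m in zip(weights, models))
        for w in vocab
    }


def cache_boost(model, keywords, mu):
    """Mix in a uniform cache model over keywords (assumed formulation)."""
    cache = unigram(list(keywords))
    return interpolate([model, cache], [1.0 - mu, mu])


# Toy data (hypothetical): baseline corpus and filtered slide text.
baseline = unigram("the model is the model".split())
slides = unigram("acoustic model adaptation".split())

# Assumed interpolation weights, for illustration only.
adapted = interpolate([baseline, slides], [0.8, 0.2])

# Assumed cache weight; "adaptation" stands in for a slide keyword.
boosted = cache_boost(adapted, ["adaptation"], mu=0.1)

print(round(adapted["adaptation"], 4), round(boosted["adaptation"], 4))
```

In this sketch the cached keyword receives extra probability mass while the distribution still sums to one. In practice the interpolation weights would typically be tuned on held-out data.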