• DocumentCode
    542289
  • Title
    Speaking-rate dependent decoding and adaptation for spontaneous lecture speech recognition
  • Author
    Nanjo, Hiroaki ; Kawahara, Tatsuya
  • Author_Institution
    School of Informatics, Kyoto University, Sakyo-ku, 606-8501, Japan
  • Volume
    1
  • fYear
    2002
  • fDate
    13-17 May 2002
  • Abstract
    This paper addresses the problem of speaking rate in large vocabulary spontaneous speech recognition. In spontaneous lecture speech, the speaking rate is generally fast and may vary considerably within a talk. We also observed different error tendencies for fast and slow speech segments. Therefore, we first present a speaking-rate dependent decoding strategy that applies the most appropriate acoustic analysis, phone models and decoding parameters according to the speaking rate. Several methods are investigated, and their selective application improves accuracy. We also propose making use of speaking-rate information in speaker adaptation, in which separate adapted models are set up for fast and slow utterances. It is confirmed that the method is more effective than normal adaptation.
  • Keywords
    Computational modeling; Three dimensional displays;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on
  • Conference_Location
    Orlando, FL, USA
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-7402-9
  • Type
    conf
  • DOI
    10.1109/ICASSP.2002.5743820
  • Filename
    5743820