Speaking-rate dependent decoding and adaptation for spontaneous lecture speech recognition

Author

Nanjo, Hiroaki ; Kawahara, Tatsuya

Author_Institution

School of Informatics, Kyoto University, Sakyo-ku, 606-8501, Japan

Year

2002

Date

13-17 May 2002

Abstract

This paper addresses the problem of speaking rate in large-vocabulary spontaneous speech recognition. In spontaneous lecture speech, the speaking rate is generally fast and can vary considerably within a talk. We also observed different error tendencies for fast and slow speech segments. We therefore first present a speaking-rate dependent decoding strategy that applies the most appropriate acoustic analysis, phone models, and decoding parameters according to the speaking rate. Several methods are investigated, and their selective application leads to improved accuracy. We also propose using speaking-rate information in speaker adaptation, setting up separate adapted models for fast and slow utterances. This method is confirmed to be more effective than normal adaptation.

Keywords

Computational modeling; Three dimensional displays

Language

English

Publisher

IEEE

Conference_Title

Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on

Conference_Location

Orlando, FL, USA

ISSN

1520-6149

Print_ISBN

0-7803-7402-9

Type

conf

DOI

10.1109/ICASSP.2002.5743820

Filename

5743820

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=542289