Robust speaking rate estimation using broad phonetic class recognition

Author

Yuan, Jiahong ; Liberman, Mark

Author_Institution

Univ. of Pennsylvania, Philadelphia, PA, USA

fYear

2010

fDate

14-19 March 2010

Firstpage

4222

Lastpage

4225

Abstract

Robust speaking rate estimation can be useful in automatic speech recognition and speaker identification, and accurate, automatic measures of speaking rate are also relevant for research in linguistics, psychology, and social sciences. In this study we built a broad phonetic class recognizer for speaking rate estimation. We tested the recognizer on a variety of data sets, including laboratory speech, telephone conversations, foreign accented speech, and speech in different languages, and we found that the recognizer's estimates are robust under these sources of variation. We also found that the acoustic models of the broad phonetic classes are more robust than those of the monophones for syllable detection.

Keywords

estimation theory; natural languages; speaker recognition; speech processing; speech recognition; automatic speech recognition; broad phonetic class recognition; foreign accented speech; laboratory speech; languages; linguistics; monophones; psychology; robust speaking rate estimation; social sciences; speaker identification; telephone conversations; Acoustic signal detection; Automatic speech recognition; Detection algorithms; Frequency; Natural languages; Psychology; Robustness; Speech enhancement; Speech recognition; Testing; Speaking rate estimation; broad phonetic class; robustness; syllable detection

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on

Conference_Location

Dallas, TX

ISSN

1520-6149

Print_ISBN

978-1-4244-4295-9

Electronic_ISBN

1520-6149

Type

conf

DOI

10.1109/ICASSP.2010.5495686

Filename

5495686