Tone recognition with fractionized models and outlined features

Author

Tian, Ye ; Zhou, Jian-Lai ; Chu, Min ; Chang, Eric

Volume

1

fYear

2004

fDate

17-21 May 2004

Abstract

Different feature extraction and tone modeling schemes are investigated on both speaker-dependent and speaker-independent continuous speech databases. Tone recognition features can be classified as detailed features, which use the entire F0 curve, and outlined features, which capture the main structure of the F0 curve. Tone models of different sizes, ranging from very simple one-tone-one-model schemes to complex phoneme-dependent tone models, differ in their ability to characterize tone. Our experiments support two conclusions. First, the detailed information of the F0 curve is not necessary for tone recognition: outlined features not only reduce the number of parameters but also improve the accuracy of tone recognition. The proposed subsection average F0 and ΔF0 are shown to be effective outlined features. Second, the one-tone-one-model scheme is not sufficient: building phoneme-dependent tone models substantially improves recognition accuracy, especially for speaker-independent data. We therefore suggest using fractionized models, trained with the outlined features, for tone recognition.
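
As a rough illustration of what outlined features of this kind might look like, the sketch below splits one syllable's voiced F0 contour into a few subsections and averages F0 and ΔF0 within each. The function name `outlined_features`, the choice of three subsections, and the use of a simple frame-to-frame difference for ΔF0 are all assumptions made for this example; the abstract does not specify them, and the paper's actual definitions may differ.

```python
from statistics import mean


def outlined_features(f0, n_sections=3):
    """Return (section_means, section_delta_means) for one F0 contour.

    f0 is a list of F0 values (e.g. in Hz) for the voiced frames of a
    syllable. n_sections is a hypothetical parameter; the abstract does
    not say how many subsections are used.
    """
    if len(f0) < n_sections + 1:
        raise ValueError("contour too short for the requested subsections")

    # Assumption: delta F0 is taken as a first-order frame difference.
    delta = [b - a for a, b in zip(f0, f0[1:])]

    def section_means(values):
        n = len(values)
        # Integer boundaries that split the sequence into near-equal parts.
        bounds = [(i * n) // n_sections for i in range(n_sections + 1)]
        return [mean(values[bounds[i]:bounds[i + 1]])
                for i in range(n_sections)]

    return section_means(f0), section_means(delta)


if __name__ == "__main__":
    contour = [100.0, 110.0, 120.0, 130.0, 140.0, 150.0]
    means, deltas = outlined_features(contour)
    print(means, deltas)
```

With three subsections this reduces a contour of any length to six numbers, which is the kind of parameter reduction the abstract attributes to outlined features.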

Keywords

audio databases; feature extraction; speech recognition; continuous speech databases; detailed features; fractionized models; outlined features; phoneme-dependent tone models; speaker-dependent data; speaker-independent data; tone modeling; tone models; tone recognition; Spatial databases; Speech

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on

ISSN

1520-6149

Print_ISBN

0-7803-8484-9

Type

conf

DOI

10.1109/ICASSP.2004.1325933

Filename

1325933