Statistical modeling of co-articulation in continuous speech based on data driven interpolation

Author

Sun, Don X.

Author_Institution

Stat. & Inf. Anal. Res., Bell Labs. Lucent Technol., Murray Hill, NJ, USA

Volume

3

fYear

1997

fDate

21-24 Apr 1997

Firstpage

1751

Abstract

Parsimonious modeling of the context-dependent nature of speech due to co-articulation is very important for improving the performance of speech recognition systems. Numerous approaches have been proposed in the literature to address this problem. However, most of these methods rely on context-dependent speech units, which inevitably increases the complexity of the model space. This paper presents a new approach to modeling speech co-articulation whose complexity is only comparable to that of context-independent models. In this model, the movement of a sequence of speech signals is characterized by a set of anchor points in the feature vector space that correspond to the target phonemic units. The transitions between the phonemic units due to co-articulation are modeled as interpolations between the target vectors. Two types of parameters are involved in the models: the intrinsic parameters in the models of target units and the auxiliary parameters specifying the transitional units. The auxiliary parameters are estimated “online” for a given sequence of speech feature vectors, so they do not contribute to the complexity of the models. Unlike “triphone”-type context-dependent models, this approach has complexity comparable to that of context-independent phoneme models, yet phonetic classification experiments showed that the new model can achieve the same performance as the more complex context-dependent models.
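
The interpolation idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the paper's actual estimator: the function names are hypothetical, the interpolation is taken to be linear between two target (anchor) vectors, and the auxiliary parameter for each frame is a single weight fitted by least squares and clipped to [0, 1]. The point is only that the weights are computed from the observed frames themselves, so they add nothing to the stored model.

```python
from typing import List, Sequence

Vector = Sequence[float]


def interpolation_weight(frame: Vector, left: Vector, right: Vector) -> float:
    """Least-squares weight w, clipped to [0, 1], such that
    (1 - w) * left + w * right is closest to the observed frame.
    Assumption: linear interpolation with one scalar weight per frame."""
    diff = [r - l for l, r in zip(left, right)]
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        # Identical targets: every weight gives the same point.
        return 0.0
    num = sum((f - l) * d for f, l, d in zip(frame, left, diff))
    return min(1.0, max(0.0, num / denom))


def interpolate(left: Vector, right: Vector, w: float) -> List[float]:
    """Point on the segment between the two target vectors."""
    return [(1.0 - w) * l + w * r for l, r in zip(left, right)]


def fit_transition(frames: Sequence[Vector], left: Vector, right: Vector) -> List[float]:
    """Estimate one auxiliary weight per frame from the data alone
    (the "online" step); the target vectors are the only stored parameters."""
    return [interpolation_weight(f, left, right) for f in frames]


if __name__ == "__main__":
    left_target = [0.0, 0.0]    # hypothetical anchor for the first phoneme
    right_target = [2.0, 4.0]   # hypothetical anchor for the second phoneme
    observed = [[0.1, 0.0], [1.0, 2.0], [2.1, 3.9]]
    weights = fit_transition(observed, left_target, right_target)
    for w in weights:
        print(w, interpolate(left_target, right_target, w))
```

In this sketch only the two target vectors would count as model parameters; the per-frame weights are recomputed for each utterance, which mirrors the abstract's distinction between intrinsic and auxiliary parameters.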

Keywords

computational complexity; interpolation; speech recognition; statistical analysis; anchor points; auxiliary parameters; co-articulation; complexity; context dependency; context-independent models; continuous speech; data driven interpolation; feature vector space; intrinsic parameter; parsimonious modeling; phonemic units; phonetic classification; sequence; speech signals; statistical modeling; Context modeling; Information analysis; Parameter estimation; Robustness; Speech analysis; Sun; Training data

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on

Conference_Location

Munich

ISSN

1520-6149

Print_ISBN

0-8186-7919-0

Type

conf

DOI

10.1109/ICASSP.1997.598863

Filename

598863