A neural architecture for computing acoustic-phonetic invariants

Author

Tsiang, Elaine

Author_Institution

Monowave Corp., Seattle, WA, USA

Volume

2

fYear

1998

fDate

12-15 May 1998

Firstpage

1109

Abstract

The proposed neural architecture consists of an analytic lower net and a synthetic upper net. This paper focuses on the upper net. The lower net performs a 2D multiresolution wavelet decomposition of an initial spectral representation to yield a multichannel representation of local frequency modulations at multiple scales. From this representation, the upper net synthesizes increasingly complex features, resulting in a set of acoustic observables at the top layer with multiscale context dependence. The upper net also provides for invariance under frequency shifts and under dilatations in tone intervals and time intervals, by building these transformations into the architecture. Application of this architecture to the recognition of gross and fine phonetic categories from continuous speech of diverse speakers shows that it provides high accuracy and strong generalization from modest amounts of training data.

Keywords

acoustic signal processing; feature extraction; frequency modulation; learning (artificial intelligence); neural net architecture; signal representation; signal resolution; spectral analysis; speech recognition; wavelet transforms; 2D multiresolution wavelet decomposition; acoustic observables; acoustic-phonetic invariants; analytic lower net; complex features synthesis; continuous speech recognition; dilatations; fine phonetic categories; frequency shifts; gross phonetic categories; high accuracy; local frequency modulation; multichannel representation; multiple scales; multiscale context dependence; neural architecture; spectral representation; synthetic upper net; time intervals; tone intervals; training data; Buildings; Computer architecture; Costs; Frequency modulation; Hidden Markov models; Neural networks; Robustness; Spectral analysis; Speech recognition; Training data

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on

Conference_Location

Seattle, WA

ISSN

1520-6149

Print_ISBN

0-7803-4428-6

Type

conf

DOI

10.1109/ICASSP.1998.675463

Filename

675463