Automatic recognition of syllabic speech segments using spectral and temporal features

Author

Ruske, Günther

Author_Institution

Technical University of Munich, Federal Republic of Germany

Volume

7

fYear

1982

fDate

May 1982

Firstpage

550

Lastpage

553

Abstract

An automatic speech recognition system is presented which starts from a demisyllable segmentation of the speech signal. Recognition of these segments is based on a set of spectral and temporal acoustic features which are automatically extracted from LPC-spectra and assembled in one feature vector for each demisyllable. The 24 components of this vector describe formants, formant loci, formant transitions, formant-like "links" for characterization of nasals, liquids or glides, the spectral distribution of fricative noise or bursts (turbulences), and duration of pauses. Preliminary recognition experiments were carried out with feature vectors extracted from a set of 360 German initial demisyllables which represent 45 consonant clusters combined with 8 vowels. When compared with template matching methods, the feature representations yield a drastic reduction in the number of components needed to represent each segment.

Keywords

Acoustic measurements; Assembly; Automatic speech recognition; Feature extraction; Linear predictive coding; Liquids; Signal analysis; Speech analysis; Speech recognition; Speech synthesis

fLanguage

English

Publisher

IEEE

Conference_Title

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '82)

Type

conf

DOI

10.1109/ICASSP.1982.1171663

Filename

1171663