Title :
An efficient speaker-independent automatic speech recognition by simulation of some properties of human auditory perception
Author :
Hermansky, Hynek
Author_Institution :
Speech Technology Laboratory, Santa Barbara, California
Abstract :
An auditory model of speech perception, perceptually based linear predictive analysis with the root power sum metric (PLP-RPS), is applied as the front-end of an automatic speech recognizer (ASR). The PLP-RPS front-end is compared with the standard linear predictive-cepstral metric (LP-CEP) front-end, and with LP-RPS and PLP-CEP front-ends. Two-spectral-peak models are the most efficient at modeling the linguistic information in speech. Consequently, in speaker-independent ASR, high-analysis-order front-ends are less effective than low-order front-ends. Synthetic speech is used for front-end evaluation. Some of the perceptual inconsistencies of standard LP front-ends are alleviated in PLP front-ends. The PLP-RPS front-end is most sensitive to the harmonic structure of the speech spectrum. Perceptual experiments indicate similar tendencies in human auditory perception.
Keywords :
Auditory system; Automatic speech recognition; Humans; Laboratories; Natural languages; Power harmonic filters; Predictive models; Speech analysis; Testing; Vocabulary
Conference_Title :
Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '87.
DOI :
10.1109/ICASSP.1987.1169803