DocumentCode :
918674
Title :
Evaluation and optimization of perceptually-based ASR front-end
Author :
Junqua, Jean-Claude ; Wakita, Hisashi ; Hermansky, Hynek
Author_Institution :
Matsushita Electric Ind. Co. Ltd., Osaka, Japan
Volume :
1
Issue :
1
fYear :
1993
fDate :
1/1/1993
Firstpage :
39
Lastpage :
48
Abstract :
Several recently proposed automatic speech recognition (ASR) front-ends are experimentally compared in speaker-dependent and speaker-independent (or cross-speaker) recognition. The perceptually based linear predictive (PLP) front-end, with the root-power sums (RPS) distance measure, generally yields the highest accuracies, especially in cross-speaker recognition. It is experimentally shown that one can optimize the system and further improve recognition accuracy for speaker-independent recognition by controlling the distance measure's sensitivity to spectral peaks and spectral tilt and by utilizing dynamic speech features. For a digit vocabulary and five reference templates obtained with a clustering algorithm, the optimization improves recognition accuracy from 97% to 98.1%, with respect to the PLP-RPS front-end.
Keywords :
filtering and prediction theory; optimisation; speech recognition; automatic speech recognition; clustering algorithm; cross-speaker recognition; digit vocabulary; distance measure; optimization; perceptually based ASR front end; perceptually based linear predictive front end; recognition accuracy; reference templates; root-power sums; speaker dependent recognition; speaker-independent recognition; spectral peaks; spectral tilt; Automatic speech recognition; Cepstral analysis; Cepstrum; Control systems; Laboratories; Pattern matching; Predictive models; Speech analysis; Speech recognition; Weight measurement;
fLanguage :
English
Journal_Title :
Speech and Audio Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1063-6676
Type :
jour
DOI :
10.1109/89.221366
Filename :
221366