Title :
Evaluation and optimization of perceptually-based ASR front-end
Author :
Junqua, Jean-Claude ; Wakita, Hisashi ; Hermansky, Hynek
Author_Institution :
Matsushita Electric Ind. Co. Ltd., Osaka, Japan
fDate :
1/1/1993 12:00:00 AM
Abstract :
Several recently proposed automatic speech recognition (ASR) front-ends are experimentally compared in speaker-dependent and speaker-independent (or cross-speaker) recognition. The perceptually based linear predictive (PLP) front-end, with the root-power sums (RPS) distance measure, generally yields the highest accuracies, especially in cross-speaker recognition. It is experimentally shown that one can optimize the system and further improve accuracy for speaker-independent recognition by controlling the distance measure's sensitivity to spectral peaks and spectral tilt and by utilizing the dynamic features of speech. For a digit vocabulary and five reference templates obtained with a clustering algorithm, the optimization improves recognition accuracy from 97% to 98.1% with respect to the PLP-RPS front-end.
Keywords :
filtering and prediction theory; optimisation; speech recognition; automatic speech recognition; clustering algorithm; cross-speaker recognition; digit vocabulary; distance measure; optimization; perceptually based ASR front end; perceptually based linear predictive front end; recognition accuracy; reference templates; root-power sums; speaker dependent recognition; speaker-independent recognition; spectral peaks; spectral tilt; Automatic speech recognition; Cepstral analysis; Cepstrum; Control systems; Laboratories; Pattern matching; Predictive models; Speech analysis; Speech recognition; Weight measurement;
Journal_Title :
Speech and Audio Processing, IEEE Transactions on