Investigating the use of speech features and their corresponding distribution characteristics for robust speech recognition

Author

Lin, Shih-Hsiang ; Yeh, Yao-Ming ; Chen, Berlin

Author_Institution

Nat. Taiwan Normal Univ., Taipei

fYear

2007

fDate

9-13 Dec. 2007

Firstpage

87

Lastpage

92

Abstract

The performance of current automatic speech recognition (ASR) systems often deteriorates sharply when the input speech is corrupted by various noise sources. Many techniques have been proposed over the last few decades to improve ASR robustness. Work reported in the literature can be broadly divided into two categories, according to whether a method operates in the feature domain or on the corresponding probability distributions. In this paper, we present a polynomial regression approach that directly characterizes the relationship between the speech features and their corresponding probability distributions in order to compensate for noise effects. Two variants of the proposed approach are also extensively investigated. All experiments are conducted on the Aurora-2 database and task. Experimental results show that, for clean-condition training, our approaches achieve considerable word error rate reductions over the baseline system and also significantly outperform other conventional methods.
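
The abstract does not give the regression formula, so the following is only a minimal sketch of one way a polynomial could relate feature values to their distribution, not the authors' implementation. It assumes a rank-based empirical CDF, a least-squares polynomial fitted from CDF value to feature value on clean data, and a synthetic scale-and-shift distortion standing in for noise. All function names, the polynomial order, and the data are hypothetical.

```python
import numpy as np

def empirical_cdf(x):
    """Rank-based CDF estimate in (0, 1) for each sample of a 1-D feature stream."""
    ranks = np.argsort(np.argsort(x))  # rank 0..N-1 of every sample
    return (ranks + 0.5) / len(x)

def fit_cdf_to_feature_poly(clean, order=5):
    """Least-squares polynomial mapping CDF value -> feature value (assumed form)."""
    return np.polyfit(empirical_cdf(clean), clean, order)

def compensate(noisy, coeffs):
    """Replace each noisy sample by the polynomial evaluated at its own CDF value."""
    return np.polyval(coeffs, empirical_cdf(noisy))

# Synthetic stand-in data: one feature dimension, clean vs. distorted.
rng = np.random.default_rng(0)
clean = rng.normal(0.0, 1.0, 2000)
noisy = 0.5 * clean[:500] + 2.0 + rng.normal(0.0, 0.1, 500)  # hypothetical distortion

coeffs = fit_cdf_to_feature_poly(clean)
restored = compensate(noisy, coeffs)
```

In this sketch the polynomial is estimated once from clean training features, and each test utterance needs only its own sample ranks at recognition time. A real system would apply this per feature dimension to actual cepstral features rather than synthetic data.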

Keywords

polynomials; regression analysis; speech recognition; ASR robustness; current automatic speech recognition systems; distribution characteristics; feature domain; noise effects compensation; noise sources; polynomial regression approach; probability distributions; robust speech recognition; speech features; Acoustic distortion; Automatic speech recognition; Histograms; Noise robustness; Nonlinear distortion; Probability distribution; Speech enhancement; Uncertainty; clustering; histogram equalization; polynomial regression; robustness

fLanguage

English

Publisher

IEEE

Conference_Titel

Automatic Speech Recognition & Understanding, 2007. ASRU. IEEE Workshop on

Conference_Location

Kyoto

Print_ISBN

978-1-4244-1746-9

Electronic_ISBN

978-1-4244-1746-9

Type

conf

DOI

10.1109/ASRU.2007.4430089

Filename

4430089