DocumentCode :
187687
Title :
Exploring HLDA based transformation for reducing acoustic mismatch in context of children speech recognition
Author :
Kathania, Hemant Kumar ; Shahnawazuddin, S. ; Sinha, Roopak
Author_Institution :
Dept. of Electron. & Commun. Eng., Nat. Inst. of Technol. Sikkim, Ravangla, India
fYear :
2014
fDate :
22-25 July 2014
Firstpage :
1
Lastpage :
5
Abstract :
This work presents novel approaches for reducing acoustic mismatch in children's speech recognition using acoustic models trained on adults' speech data. In this regard, heteroscedastic linear discriminant analysis (HLDA) based transformation of the test data is explored. It is well known that HLDA reduces the dimensionality of the feature parameters while also increasing the discrimination among them. Consequently, unlike the cepstral truncation based approach, the effects of pitch mismatch are mitigated without any significant loss of spectral information. This leads to improved system performance, as reported in this work. Furthermore, the impact of model retraining after the transformation is explored. Model retraining after HLDA based transformation is found to yield additional improvements. A constrained approach to generating and employing the HLDA based transformation is also proposed. In this case, HLDA based transforms are learned using the base MFCC features only and then applied to all the feature dimensions, i.e., base, delta and delta-delta features. The constrained approach is found to be superior to the unconstrained scheme. This work also reports the impact of an iterative transform learning approach on system performance. The iterative approach to learning the HLDA based transform is reported to perform better than the non-iterative approach.
Keywords :
speech recognition; HLDA based transformation; HLDA based transforms; MFCC features; acoustic mismatch; acoustic models; adult speech data; cepstral truncation; children speech recognition; delta-delta features; heteroscedastic linear discriminant analysis; iterative transform learning; pitch mismatch; spectral information; system performance; Data models; Hidden Markov models; Mel frequency cepstral coefficient; Speech; Speech recognition; Transforms; Automatic speech recognition; HLDA transformation; acoustic mismatch; children's speech recognition
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signal Processing and Communications (SPCOM), 2014 International Conference on
Conference_Location :
Bangalore
Print_ISBN :
978-1-4799-4666-2
Type :
conf
DOI :
10.1109/SPCOM.2014.6983999
Filename :
6983999