DocumentCode :
2178350
Title :
Multi-stream spectro-temporal and cepstral features based on data-driven hierarchical phoneme clusters
Author :
Li, Shang-wen ; Sun, Liang-Che ; Lee, Lin-shan
Author_Institution :
Grad. Inst. of Commun. Eng., Nat. Taiwan Univ., Taipei, Taiwan
fYear :
2011
fDate :
22-27 May 2011
Firstpage :
5196
Lastpage :
5199
Abstract :
We propose a method to enhance multi-stream Gabor and MFCC features using data-driven hierarchical phoneme clusters to yield more discriminating posteriors. We take into account different hierarchy structures, and in addition perform mean and variance normalization. A relative improvement of 11.5% over the conventional MFCC Tandem system was achieved in experiments conducted on Mandarin broadcast news. We analyze the complementarity between Gabor and MFCC features for different types of phonemes, and investigate the benefits that come from using hierarchical phoneme clusters.
Keywords :
speech recognition; MFCC features; automatic speech recognition; data-driven hierarchical phoneme clusters; multistream Gabor features; multistream spectro-temporal features; Gabor filters; LVCSR; clustered hierarchical MLP; spectro-temporal features
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Conference_Location :
Prague
ISSN :
1520-6149
Print_ISBN :
978-1-4577-0538-0
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2011.5947528
Filename :
5947528