Title :
Multi-stream spectro-temporal and cepstral features based on data-driven hierarchical phoneme clusters
Author :
Li, Shang-wen ; Sun, Liang-Che ; Lee, Lin-shan
Author_Institution :
Grad. Inst. of Commun. Eng., Nat. Taiwan Univ., Taipei, Taiwan
Abstract :
We propose a method to enhance multi-stream Gabor and MFCC features using data-driven hierarchical phoneme clusters, yielding more discriminative posteriors. We consider different hierarchy structures and additionally apply mean and variance normalization. In experiments on Mandarin broadcast news, the method achieved a relative improvement of 11.5% over the conventional MFCC Tandem system. We also analyze the complementarity between Gabor and MFCC features for different types of phonemes and investigate the benefits of using hierarchical phoneme clusters.
Keywords :
speech recognition; MFCC features; automatic speech recognition; data-driven hierarchical phoneme clusters; multistream Gabor features; multistream spectro-temporal features; Gabor filters; LVCSR; clustered hierarchical MLP; spectro-temporal features
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Conference_Location :
Prague
Print_ISBN :
978-1-4577-0538-0
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2011.5947528