Regional Information Center for Science and Technology (RICeST) - Multiple time-span feature fusion for deep neural network modeling

DocumentCode :

134321

Title :

Multiple time-span feature fusion for deep neural network modeling

Author :

Chongjia Ni ; Chen, Nancy F. ; Bin Ma

Author_Institution :

Inst. for Infocomm Res., A*STAR, Singapore, Singapore

fYear :

2014

fDate :

12-14 Sept. 2014

Firstpage :

138

Lastpage :

142

Abstract :

In this paper, we exploit long-term information from multiple time-spans for automatic speech recognition. The multiple time-span information is encoded into three different feature streams: speaker-adaptation-transformed features, deep bottleneck features, and deep hierarchical bottleneck features. By combining three different time-spans in discriminative acoustic modeling, we reduce the character error rate for Mandarin and the syllable error rate for Vietnamese in conversational telephone speech recognition. Relative to the DNN-HMM baselines, we obtain absolute reductions of 0.8% in character error rate for Mandarin and 1.9% in syllable error rate for Vietnamese. Further analysis also suggests that our proposed feature fusion approach is able to encode finer-grained temporal information than directly using input features of long time-spans in DNN-HMM baselines.

Keywords :

acoustic signal processing; error statistics; feature extraction; hidden Markov models; natural language processing; neural nets; sensor fusion; speech recognition; DNN-HMM baselines; Mandarin language; Vietnamese language; automatic speech recognition; character error rate; conversational telephone speech recognition; deep hierarchical bottleneck features; deep neural network modeling; discriminative acoustic modeling; feature streams; hidden Markov model; long term information; multiple time-span feature fusion; multiple time-span information; speaker-adaptation-transformed features; syllable error rate; Acoustics; Feature extraction; Hidden Markov models; Neural networks; Speech; Speech recognition; Training; Feature representation; Hidden Markov model (HMM); deep bottleneck; deep hierarchical bottleneck; deep neural network (DNN);

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Chinese Spoken Language Processing (ISCSLP), 2014 9th International Symposium on

Conference_Location :

Singapore

Type :

conf

DOI :

10.1109/ISCSLP.2014.6936707

Filename :

6936707

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=134321