Title :
Improving denoising auto-encoder based speech enhancement with the speech parameter generation algorithm
Author :
Syu-Siang Wang;Hsin-Te Hwang;Ying-Hui Lai;Yu Tsao;Xugang Lu;Hsin-Min Wang;Borching Su
Author_Institution :
Graduate Institute of Communication Engineering, National Taiwan University, Taiwan
Abstract :
This paper investigates the use of the speech parameter generation (SPG) algorithm, which has been successfully adopted in deep neural network (DNN)-based voice conversion (VC) and speech synthesis (SS), to incorporate temporal information into deep denoising auto-encoder (DDAE)-based speech enhancement. Our previous studies confirmed that the DDAE can effectively suppress noise components in noise-corrupted speech. However, because the DDAE converts speech frame by frame, the enhanced speech shows some discontinuity even when context features are used as input to the DDAE. To address this issue, this study proposes using the SPG algorithm as a post-processor that transforms the DDAE-processed feature sequence into one with a smoothed trajectory. Two types of temporal information are investigated with SPG: static-dynamic features and context features. Experimental results show that SPG with context features outperforms both SPG with static-dynamic features and the baseline system, which uses context features without SPG, on standardized objective tests across different noise types and SNRs.
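The static-dynamic variant of the smoothing step can be illustrated with a minimal sketch. The abstract does not give the window coefficients or the variance handling, so the code below assumes the usual maximum-likelihood parameter generation form from speech synthesis: a one-dimensional feature track, a central-difference delta window, and constant diagonal variances. The function names `build_window_matrix` and `generate_trajectory` are hypothetical and are not taken from the paper.

```python
import numpy as np


def build_window_matrix(num_frames):
    """Stack the static (identity) and delta windows into W, shape (2T, T)."""
    static = np.eye(num_frames)
    delta = np.zeros((num_frames, num_frames))
    for t in range(num_frames):
        # Assumed delta window: 0.5 * (c[t+1] - c[t-1]), edges clamped.
        delta[t, max(t - 1, 0)] -= 0.5
        delta[t, min(t + 1, num_frames - 1)] += 0.5
    return np.vstack([static, delta])


def generate_trajectory(static_mean, delta_mean, static_var=1.0, delta_var=1.0):
    """Solve (W^T P W) c = W^T P m for the smoothed static trajectory c.

    m stacks the predicted static and delta features, and P is the diagonal
    precision matrix (assumed constant per stream in this sketch).
    """
    static_mean = np.asarray(static_mean, dtype=float)
    delta_mean = np.asarray(delta_mean, dtype=float)
    num_frames = len(static_mean)
    W = build_window_matrix(num_frames)
    m = np.concatenate([static_mean, delta_mean])
    precision = np.concatenate([
        np.full(num_frames, 1.0 / static_var),
        np.full(num_frames, 1.0 / delta_var),
    ])
    WtP = W.T * precision  # scales each column of W^T, i.e. W^T P
    return np.linalg.solve(WtP @ W, WtP @ m)


if __name__ == "__main__":
    # A single-frame spike with zero predicted deltas gets flattened.
    spike = np.zeros(11)
    spike[5] = 1.0
    print(generate_trajectory(spike, np.zeros(11), delta_var=0.1))
```

In this sketch a smaller `delta_var` puts more trust in the predicted dynamic features and therefore yields a smoother output. When the predicted deltas are exactly consistent with the predicted statics, the static track is returned unchanged.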
Keywords :
"Speech","Speech enhancement","Context","Noise measurement","Training","Hidden Markov models","Context modeling"
Conference_Title :
Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2015 Asia-Pacific
DOI :
10.1109/APSIPA.2015.7415295