DocumentCode :
730085
Title :
A pairwise algorithm for pitch estimation and speech separation using deep stacking network
Author :
Hui Zhang ; Xueliang Zhang ; Shuai Nie ; Guanglai Gao ; Wenju Liu
Author_Institution :
Comput. Sci. Dept., Inner Mongolia Univ., Hohhot, China
fYear :
2015
fDate :
19-24 April 2015
Firstpage :
246
Lastpage :
250
Abstract :
Pitch information is an important cue for speech separation. However, pitch estimation in noisy conditions is as challenging as speech separation itself. In this paper, we propose a supervised learning architecture that combines these two problems concisely. The proposed algorithm is based on the deep stacking network (DSN), which builds a deep architecture by stacking simple processing modules. In the training stage, the ideal binary mask is used as the target. The input vector comprises the outputs of the lower module and frame-level features, which consist of spectral and pitch-based features. In the testing stage, each module provides an estimated binary mask, which is used to re-estimate the pitch. The updated pitch-based features are then passed to the next module. This procedure is embedded iteratively in the DSN, and the final separation results are obtained from the last module of the DSN. Systematic evaluations show that the proposed approach produces high-quality estimated binary masks and outperforms recent systems in generalization.
Keywords :
learning (artificial intelligence); speech processing; binary mask; deep stacking network; pairwise algorithm; pitch estimation; speech separation; supervised learning architecture; Noise; Speech; Testing; Training; Computational auditory scene analysis; Pitch estimation; Speech separation; Supervised learning
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location :
South Brisbane, QLD
Type :
conf
DOI :
10.1109/ICASSP.2015.7177969
Filename :
7177969