DocumentCode :
1437154
Title :
Blind Separation and Dereverberation of Speech Mixtures by Joint Optimization
Author :
Yoshioka, Takuya ; Nakatani, Tomohiro ; Miyoshi, Masato ; Okuno, Hiroshi G.
Author_Institution :
NTT Commun. Sci. Labs., Nippon Telegraph & Telephone Corp., Seika, Japan
Volume :
19
Issue :
1
Year :
2011
Firstpage :
69
Lastpage :
84
Abstract :
This paper proposes a method for performing blind source separation (BSS) and blind dereverberation (BD) at the same time for speech mixtures. In most previous studies, BSS and BD have been investigated separately. The separation performance of conventional BSS methods deteriorates as the reverberation time increases while many existing BD methods rely on the assumption that there is only one sound source in a room. Therefore, it has been difficult to perform both BSS and BD when the reverberation time is long. The proposed method uses a network, in which dereverberation and separation networks are connected in tandem, to estimate source signals. The parameters for the dereverberation network (prediction matrices) and those for the separation network (separation matrices) are jointly optimized. This enables a BD process to take a BSS process into account. The prediction and separation matrices are alternately optimized with each depending on the other; hence, we call the proposed method the conditional separation and dereverberation (CSD) method. Comprehensive evaluation results are reported, where all the speech materials contained in the complete test set of the TIMIT corpus are used. The CSD method improves the signal-to-interference ratio by an average of about 4 dB over the conventional frequency-domain BSS approach for reverberation times of 0.3 and 0.5 s. The direct-to-reverberation ratio is also improved by about 10 dB.
Keywords :
blind source separation; reverberation; speech processing; TIMIT corpus; blind dereverberation; blind separation; direct-to-reverberation ratio; joint optimization; reverberation time; signal-to-interference ratio; source signal estimation; speech mixtures; Blind source separation; Higher order statistics; MIMO; Materials testing; Microphones; Optimization methods; Reverberation; Source separation; Speech analysis; Speech processing; Blind source separation (BSS); blind dereverberation (BD); conditional separation and dereverberation (CSD);
Language :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2010.2045183
Filename :
5428853