DocumentCode
2124053
Title
A blind separation algorithm of speech mixtures base on time-frequency masking
Author
Guo, Wei ; Zong, Qingquan
Author_Institution
Comput. Sch., Wuhan Univ., Wuhan, China
fYear
2012
fDate
21-23 April 2012
Firstpage
2258
Lastpage
2261
Abstract
Based on time-frequency masking, we propose a blind separation algorithm for speech mixtures that can separate any number of sources using only two mixtures. The method is valid when the sources are W-disjoint orthogonal, that is, when the supports of the windowed Fourier transforms of the signals in the mixture are disjoint. In the time-frequency domain, performance is compared for floating-point and fixed-point implementations. A weighted K-means clustering algorithm is presented as an alternative to gradient descent methods for peak tracking and is demonstrated to achieve excellent performance without adversely affecting computational load. We first extract the spatial cues of the speech signal, which are relative attenuation-delay pairs. Then, motivated by the maximum likelihood mixing parameter estimators, we define a power-weighted two-dimensional (2-D) histogram, constructed from the ratio of the time-frequency representations of the mixtures, that is shown to have one peak for each source, with the peak location corresponding to the relative attenuation and delay mixing parameters. Next, we derive a binary time-frequency mask from these peaks and use it to separate the sources in the time-frequency domain. Finally, the inverse STFT (I-STFT) transforms the separated sources back to the time domain, yielding the separated signals. In short, the proposed algorithm offers a new prospect for research on blind speech separation.
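The cue-extraction, histogram, and masking steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the anechoic two-mixture model X2 = a * exp(-j * omega * delta) * X1 at any time-frequency point where a single source is active. The function names, the bin widths, and the weight |X1 * X2| are hypothetical choices made for illustration only.

```python
import cmath
from collections import defaultdict


def attenuation_delay(x1, x2, omega):
    """Relative attenuation and delay at one time-frequency point.

    Assumes X2 = a * exp(-1j * omega * delta) * X1 with x1 != 0,
    omega > 0 and |omega * delta| < pi (no phase wrapping).
    """
    ratio = x2 / x1
    a = abs(ratio)
    delta = -cmath.phase(ratio) / omega
    return a, delta


def weighted_histogram(points, a_width=0.1, d_width=0.1):
    """Power-weighted 2-D histogram over (attenuation, delay) bins.

    points: iterable of (x1, x2, omega) tuples with complex x1, x2.
    The bin widths and the weight |x1 * x2| are illustrative assumptions.
    """
    hist = defaultdict(float)
    for x1, x2, omega in points:
        if x1 == 0:
            continue  # the mixture ratio is undefined at this point
        a, delta = attenuation_delay(x1, x2, omega)
        key = (round(a / a_width), round(delta / d_width))
        hist[key] += abs(x1 * x2)
    return hist


def binary_mask(points, peaks):
    """Label each point with the index of the nearest (a, delta) peak.

    Assumes every x1 in points is nonzero.
    """
    labels = []
    for x1, x2, omega in points:
        a, delta = attenuation_delay(x1, x2, omega)
        dists = [(a - pa) ** 2 + (delta - pd) ** 2 for pa, pd in peaks]
        labels.append(dists.index(min(dists)))
    return labels
```

In a full pipeline, the points would come from the STFTs of the two mixtures, the peaks from a peak search or clustering over the histogram, and the labels would select which time-frequency points are passed to the inverse STFT for each source.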
Keywords
Fourier transforms; pattern clustering; speech processing; W-disjoint orthogonal; Weighted K-means clustering algorithm; blind separation algorithm; fixed point implementations; floating point implementations; maximum likelihood mixing parameter estimators; spatial cues; speech mixtures; speech signal; time frequency domain; time frequency masking; windowed Fourier transform; Attenuation; Clustering algorithms; Delay; Histograms; Source separation; Speech; Time frequency analysis; Blind Separation; Time-Frequency Masking; W-disjoint orthogonal;
fLanguage
English
Publisher
IEEE
Conference_Titel
Consumer Electronics, Communications and Networks (CECNet), 2012 2nd International Conference on
Conference_Location
Yichang
Print_ISBN
978-1-4577-1414-6
Type
conf
DOI
10.1109/CECNet.2012.6201885
Filename
6201885