Title :
Combining Machine Learning and Computational Auditory Scene Analysis to Separate Monaural Speech of Two-Talker
Author :
Li, Peng ; Guan, Yong ; Liu, Wenju ; Xu, Bo
Author_Institution :
Digital Media Content Technol. Res. Center, Chinese Acad. of Sci., Beijing
Date :
Aug. 30 2007-Sept. 1 2007
Abstract :
Monaural speech separation is one of the most difficult problems in speech signal processing. This paper proposes a new method, based on machine learning and computational auditory scene analysis (CASA), for separating monaural speech from two talkers. Machine learning is used to learn the grouping cues from isolated clean speech of a single speaker. A factorial-max vector quantization model (MAXVQ) then infers the masking signals needed for resynthesis, which accomplishes the separation. Experiments on a standard corpus show that the proposed method separates the mixed speech of two speakers well, with a clear improvement in the SNR of the separated speech.
Keywords :
learning (artificial intelligence); speaker recognition; speech processing; vector quantisation; computational auditory scene analysis; factorial-max vector quantization model; machine learning; monaural speech separation; speech signal processing; Automation; Humans; Image analysis; Machine learning; Pattern recognition; Prototypes; Speech analysis; Speech coding; Speech processing; Timbre;
Conference_Title :
Natural Language Processing and Knowledge Engineering, 2007. NLP-KE 2007. International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-1610-3
Electronic_ISBN :
978-1-4244-1611-0
DOI :
10.1109/NLPKE.2007.4368044