Title :
Supervised and semi-supervised suppression of background music in monaural speech recordings
Author :
Weninger, Felix ; Feliu, Jordi ; Schuller, Björn
Author_Institution :
Inst. for Human-Machine Commun., Tech. Univ. München, München, Germany
Abstract :
In this paper, we propose a semi-supervised algorithm based on sparse non-negative matrix factorization (NMF) to improve separation of speech from background music in monaural signals. In our approach, fixed speech basis vectors are obtained from training data, whereas music bases are estimated on the fly to cope with spectral variability while keeping the NMF dimensionality small to reduce computational effort. In a large-scale experimental evaluation with 168 speakers from the TIMIT database, we compare the semi-supervised method to supervised NMF with an explicit background music model. Our results reveal that the semi-supervised method outperforms supervised NMF at low speech-to-music ratios, and that sparsity constraints on the music spectra, which enforce harmonicity, can improve separation performance.
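The semi-supervised idea in the abstract (speech bases held fixed, music bases learned from the mixture itself) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes a Euclidean cost with standard multiplicative updates, omits the sparsity constraints entirely, and reconstructs speech with a soft mask. The function name `semi_supervised_nmf`, its parameters, and the default of four music components are hypothetical choices made for illustration.

```python
import numpy as np


def semi_supervised_nmf(V, W_speech, n_music=4, n_iter=200, eps=1e-9, seed=0):
    """Factorize a magnitude spectrogram V (freq x frames) as
    V ~ [W_speech, W_music] @ H, keeping W_speech fixed.

    Illustrative assumptions: Euclidean cost, multiplicative updates,
    no sparsity penalty, soft-mask reconstruction of the speech part.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    n_speech = W_speech.shape[1]

    # Music bases and all activations start from random positive values.
    W_music = rng.random((n_freq, n_music)) + eps
    H = rng.random((n_speech + n_music, n_frames)) + eps

    for _ in range(n_iter):
        W = np.hstack([W_speech, W_music])

        # Update the activations of both speech and music components.
        H *= (W.T @ V) / (W.T @ W @ H + eps)

        # Update only the music bases; the speech bases are never modified.
        H_music = H[n_speech:]
        W_music *= (V @ H_music.T) / (W @ H @ H_music.T + eps)

    # Soft mask: share of each time-frequency bin explained by speech.
    speech_part = W_speech @ H[:n_speech]
    music_part = W_music @ H[n_speech:]
    mask = speech_part / (speech_part + music_part + eps)
    return mask * V, W_music, H
```

In use, `W_speech` would come from an NMF of clean training speech spectrograms, and `V` would be the magnitude spectrogram of the speech-plus-music mixture. The returned speech magnitude estimate can then be combined with the mixture phase for resynthesis.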
Keywords :
matrix decomposition; speech enhancement; TIMIT database; explicit background music model; fixed speech basis vector; harmonicity; monaural signal; monaural speech recording; semisupervised algorithm; semisupervised method; semisupervised suppression; sparse nonnegative matrix factorization; sparsity constraints; speech to music ratio; supervised NMF; training data; Databases; Discrete Fourier transforms; Multiple signal classification; Music; Spectrogram; Speech; Speech processing; non-negative matrix factorization; sparse coding; speech enhancement; supervised source separation;
Conference_Title :
2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference_Location :
Kyoto
Print_ISBN :
978-1-4673-0045-2
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2012.6287817