Title :
Speech enhancement using segmental nonnegative matrix factorization
Author :
Hao-teng Fan ; Jeih-weih Hung ; Xugang Lu ; Syu-Siang Wang ; Yu Tsao
Author_Institution :
Dept. of Electr. Eng., Nat. Chi Nan Univ., Nantou, Taiwan
Abstract :
The conventional NMF-based speech enhancement algorithm applies NMF to the magnitude spectrograms of the clean speech and noise in the training data to estimate a set of spectral basis vectors. These basis vectors span a space in which the magnitude spectrogram of each noise-corrupted test utterance is approximated. Finally, the components associated with the clean-speech spectral basis vectors are used to construct the updated magnitude spectrogram, producing an enhanced speech utterance. Because rich spectral-temporal structure can be exploited in local frequency and time-varying spectral patches, this study proposes a segmental NMF (SNMF) speech enhancement scheme to improve on the conventional frame-wise NMF-based method. Two algorithms are derived to decompose the original nonnegative matrix associated with the magnitude spectrogram: the first operates in the spectral domain and the second in the temporal domain. With these decompositions, noisy speech signals can be modeled more precisely, and the speech portion of the spectrogram can be reconstructed more accurately than with the conventional NMF-based method. Objective evaluations using perceptual evaluation of speech quality (PESQ) indicate that the proposed SNMF strategy improves speech quality under noisy conditions and outperforms the well-known MMSE log-spectral amplitude (LSA) estimator.
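The conventional frame-wise pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the cost function or update rule, so the code assumes the standard multiplicative updates for the Euclidean cost, and the function names (`nmf`, `activations`, `enhance`), basis sizes, and iteration counts are hypothetical. The segmental (sub-band and temporal-patch) decompositions proposed in the paper are not reproduced here.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero in the multiplicative updates


def nmf(V, rank, n_iter=200, seed=0):
    """Factor a nonnegative matrix V (freq x frames) as W @ H.

    Assumption: multiplicative updates minimizing the Euclidean cost.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + EPS
    H = rng.random((rank, n_frames)) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
        W *= (V @ H.T) / (W @ H @ H.T + EPS)
    return W, H


def activations(V, W, n_iter=200, seed=0):
    """Estimate activations H for V while keeping the basis W fixed."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
    return H


def enhance(V_noisy, W_speech, W_noise, n_iter=200):
    """Approximate the noisy magnitude spectrogram with the joint basis
    [W_speech, W_noise], then keep only the speech-basis components."""
    W = np.hstack([W_speech, W_noise])
    H = activations(V_noisy, W, n_iter)
    k = W_speech.shape[1]
    return W_speech @ H[:k]


if __name__ == "__main__":
    # Synthetic stand-ins for magnitude spectrograms (not real speech data).
    rng = np.random.default_rng(1)
    speech = rng.random((16, 3)) @ rng.random((3, 40))
    noise = rng.random((16, 2)) @ rng.random((2, 40))
    W_s, _ = nmf(speech, rank=3)  # training: clean-speech basis
    W_n, _ = nmf(noise, rank=2)   # training: noise basis
    enhanced = enhance(speech + noise, W_s, W_n)
    print(enhanced.shape)
```

In a real system the enhanced magnitude would be combined with the noisy phase and inverted to a waveform; that step, like the choice of cost function, is outside what the abstract specifies.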
Keywords :
least mean squares methods; matrix decomposition; speech enhancement; MMSE log spectral amplitude estimation; magnitude spectrograms; noisy speech signals; perceptual evaluation of speech quality; segmental nonnegative matrix factorization; spectral basis vectors; spectral domain; speech enhancement; temporal domain; Hidden Markov models; Noise; Spectrogram; Speech; Speech enhancement; Vectors; NMF; nonnegative matrix factorization; patch processing; speech enhancement; sub-band processing;
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location :
Florence
DOI :
10.1109/ICASSP.2014.6854450