• DocumentCode
    179189
  • Title
    Speech enhancement using segmental nonnegative matrix factorization
  • Author
    Hao-teng Fan; Jeih-weih Hung; Xugang Lu; Syu-Siang Wang; Yu Tsao

  • Author_Institution
    Dept. of Electr. Eng., Nat. Chi Nan Univ., Nantou, Taiwan
  • fYear
    2014
  • fDate
    4-9 May 2014
  • Firstpage
    4483
  • Lastpage
    4487
  • Abstract
    The conventional NMF-based speech enhancement algorithm applies NMF to the magnitude spectrograms of both clean speech and noise in the training data to estimate a set of spectral basis vectors. These basis vectors span a space used to approximate the magnitude spectrogram of noise-corrupted test utterances. Finally, the components associated with the clean-speech spectral basis vectors are used to construct the updated magnitude spectrogram, producing an enhanced speech utterance. Considering that rich spectral-temporal structure can be exploited in local frequency and time-varying spectral patches, this study proposes a segmental NMF (SNMF) speech enhancement scheme to improve on the conventional frame-wise NMF-based method. Two algorithms are derived to decompose the original nonnegative matrix associated with the magnitude spectrogram: the first operates in the spectral domain and the second in the temporal domain. With these decomposition processes, noisy speech signals can be modeled more precisely, and the speech portion of the spectrogram can be reconstructed more faithfully than with the conventional NMF-based method. Objective evaluations using perceptual evaluation of speech quality (PESQ) indicate that the proposed SNMF strategy improves sound quality under noisy conditions and outperforms the well-known MMSE log-spectral amplitude (LSA) estimator.
  • Keywords
    least mean squares methods; matrix decomposition; speech enhancement; MMSE log spectral amplitude estimation; magnitude spectrograms; noisy speech signals; perceptual evaluation of speech quality; segmental nonnegative matrix factorization; spectral basis vectors; spectral domain; temporal domain; Hidden Markov models; Noise; Spectrogram; Speech; Vectors; NMF; nonnegative matrix factorization; patch processing; sub-band processing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
  • Conference_Location
    Florence
  • Type
    conf
  • DOI
    10.1109/ICASSP.2014.6854450
  • Filename
    6854450
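
The conventional frame-wise NMF enhancement summarized in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation and not the proposed SNMF: the record does not state the cost function or update rule, so the Euclidean cost with multiplicative updates is an assumption here, and the names nmf and enhance are hypothetical. The enhanced spectrogram is taken directly as the product of the speech bases and their activations.

```python
import numpy as np

EPS = 1e-10  # guards against division by zero in the updates


def nmf(V, rank, n_iter=200, W=None, seed=0):
    """Approximate V (freq x frames) by W @ H using multiplicative updates
    for the Euclidean cost (an assumed choice). If W is supplied, it is
    held fixed and only the activations H are estimated."""
    rng = np.random.default_rng(seed)
    fixed = W is not None
    if not fixed:
        W = rng.random((V.shape[0], rank)) + EPS
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
        if not fixed:
            W *= (V @ H.T) / (W @ H @ H.T + EPS)
    return W, H


def enhance(V_noisy, W_speech, W_noise, n_iter=200):
    """Fit the noisy magnitude spectrogram with the stacked speech and
    noise bases, then keep only the speech-basis components."""
    W = np.hstack([W_speech, W_noise])
    _, H = nmf(V_noisy, W.shape[1], n_iter=n_iter, W=W)
    H_speech = H[: W_speech.shape[1]]  # activations of the speech bases
    return W_speech @ H_speech


if __name__ == "__main__":
    # Random nonnegative matrices stand in for real magnitude spectrograms.
    rng = np.random.default_rng(1)
    V_clean, V_noise = rng.random((64, 100)), rng.random((64, 100))
    W_s, _ = nmf(V_clean, rank=8)   # training: speech bases
    W_n, _ = nmf(V_noise, rank=4)   # training: noise bases
    V_hat = enhance(V_clean + V_noise, W_s, W_n)
    print(V_hat.shape)
```

In practice the magnitude spectrograms would come from a short-time Fourier transform, and the enhanced magnitude would be combined with the noisy phase before inversion; both steps are outside this sketch.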