  • DocumentCode
    1759282
  • Title
    An Experimental Study on Speech Enhancement Based on Deep Neural Networks
  • Author
    Yong Xu ; Jun Du ; Li-Rong Dai ; Chin-Hui Lee
  • Author_Institution
    Nat. Eng. Lab. for Speech & Language Inf. Process., Univ. of Sci. & Technol. of China, Hefei, China
  • Volume
    21
  • Issue
    1
  • fYear
    2014
  • fDate
    Jan. 2014
  • Firstpage
    65
  • Lastpage
    68
  • Abstract
    This letter presents a regression-based speech enhancement framework using deep neural networks (DNNs) with a multiple-layer deep architecture. In the DNN learning process, a large training set ensures a powerful modeling capability to estimate the complicated nonlinear mapping from observed noisy speech to desired clean signals. Acoustic context was found to improve the continuity of speech to be separated from the background noises successfully without the annoying musical artifact commonly observed in conventional speech enhancement algorithms. A series of pilot experiments were conducted under multi-condition training with more than 100 hours of simulated speech data, resulting in a good generalization capability even in mismatched testing conditions. When compared with the logarithmic minimum mean square error approach, the proposed DNN-based algorithm tends to achieve significant improvements in terms of various objective quality measures. Furthermore, in a subjective preference evaluation with 10 listeners, 76.35% of the subjects were found to prefer DNN-based enhanced speech to that obtained with the conventional technique.
  • Keywords
    learning (artificial intelligence); neural nets; speech enhancement; DNN learning process; acoustic context; deep neural networks; large training set; logarithmic minimum mean square error approach; multicondition training; multiple-layer deep architecture; nonlinear mapping; regression-based speech enhancement framework; Data models; Neural networks; Noise; Noise measurement; Speech; Speech enhancement; Training; Deep neural networks; noise reduction; regression model; speech enhancement;
  • fLanguage
    English
  • Journal_Title
    Signal Processing Letters, IEEE
  • Publisher
    IEEE
  • ISSN
    1070-9908
  • Type
    jour
  • DOI
    10.1109/LSP.2013.2291240
  • Filename
    6665000