• DocumentCode
    454600
  • Title

    Speech Bandwidth Enhancement using State Space Speech Dynamics

  • Author

    Yao, Sheng ; Chan, Cheung-Fat

  • Author_Institution
    Dept. of Electron. Eng., City Univ. of Hong Kong
  • Volume
    1
  • fYear
    2006
  • fDate
    14-19 May 2006
  • Abstract
    Extending narrowband speech (0-4 kHz) to wideband speech (0-8 kHz) has applications in telephone systems and in speech recognition systems where wideband training speech data may not be available. A number of methods have been proposed to retrieve the missing high-band information (4-8 kHz) from narrowband speech. Memoryless systems are likely to produce large hissing artifacts, since the mutual information between the low-band (0-4 kHz) and high-band (4-8 kHz) spectra is actually quite low. Generally speaking, bandwidth extension cannot recover the original high-band information, but good approximations with less over-estimation of the high-band energy, which is usually perceived as a hissing artifact, can be obtained by considering the neighboring speech frames. In this paper, we propose a new bandwidth extension system with memory that uses a state-space model to capture the long-term speech dynamics. The model parameters can be trained in the maximum likelihood (ML) sense, and the enhancement is obtained via wideband state vector estimation and Kalman filtering. The performance in terms of spectral distortion is shown to be much better than that of memoryless systems and comparable to that of an earlier continuous density hidden Markov model (CDHMM) system with memory. The new state-space method is inherently sequential and has the advantages of lower processing delay and robustness against block detection errors.
  • Keywords
    Kalman filters; hidden Markov models; maximum likelihood estimation; speech enhancement; speech recognition; state-space methods; 0 to 8 kHz; Kalman filtering; ML; bandwidth extension system; block detection errors; continuous density hidden Markov model memory system; hissing artifact; long-term speech dynamics; maximum likelihood; narrowband speech; spectral distortion; speech bandwidth enhancement; speech recognition systems; state space speech dynamics; telephone systems; wideband speech; wideband state vector estimation; Bandwidth; Hidden Markov models; Maximum likelihood estimation; Memoryless systems; Narrowband; Speech enhancement; Speech recognition; State-space methods; Telephony; Wideband;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on
  • Conference_Location
    Toulouse
  • ISSN
    1520-6149
  • Print_ISBN
    1-4244-0469-X
  • Type

    conf

  • DOI
    10.1109/ICASSP.2006.1660064
  • Filename
    1660064