• Title of article

    A perception- and PDE-based nonlinear transformation for processing spoken words

  • Author/Authors

    Qi, Yingyong and Xin, Jack

  • Issue Information
    Journal with serial number, year 2001
  • Pages
    18
  • From page
    143
  • To page
    160
  • Abstract
    Speech signals are often produced or received in the presence of noise, which is known to degrade the performance of a speech recognition system. In this paper, a perception- and PDE-based nonlinear transformation was developed to process spoken words in noisy environments. Our goal is to distinguish essential speech features and suppress noise so that the processed words are better recognized by computer software. The nonlinear transformation was applied to the spectrogram (short-term Fourier spectra) of speech signals, which reveals the signal energy distribution in time and frequency. The transformation reduces noise through time adaptation (reducing temporally slowly varying portions of spectra) and enhances spectral peaks (formants) by evolving a focusing quadratic fourth-order PDE. Short-term spectra of speech signals were initially divided into three (low, mid and high) frequency bands based on the critical bandwidth of human audition. An algorithm was developed to trace the upper and lower intensity envelopes of the signal in each band. The difference between the upper and lower envelopes reflects the signal-to-noise ratio (SNR) of each band. Constant, low-SNR signals in each band were adaptively decreased to reduce noise. Evolution of the focusing PDE was then used to enhance the spectral peaks and further reduce noise interference. Numerical results on noisy spoken words indicated that the transformed spectral pattern of the spoken words was insensitive to noise for SNR ranging from 0 to 20 dB (decibel). The spectral distances between noisy words and original words decreased after the transformation. A numerical experiment was performed on 11 spoken words at SNR = 5 dB. A noisy word is recognized numerically by finding the clean template at the smallest L2 spectral distance. The experiment reached a recognition rate as high as 100%. Analyses of the properties of the transformation are provided.
  • Keywords
    Spoken words processing , Nonlinear transformation , Noise , Perception and PDE
  • Journal title
    Physica D Nonlinear Phenomena
  • Serial Year
    2001
  • Record number

    1727140
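
The recognition step described in the abstract (pick the clean template at the smallest L2 spectral distance from the noisy word) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the Hann window, the frame length and the hop size are all assumptions, and the perception- and PDE-based transformation itself is not reproduced here.

```python
import numpy as np


def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude short-term Fourier spectra, one row per frame.

    frame_len and hop are illustrative choices, not values from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1))


def l2_distance(spec_a, spec_b):
    """L2 distance between two spectrograms of identical shape."""
    return float(np.sqrt(np.sum((spec_a - spec_b) ** 2)))


def recognize(noisy_spec, templates):
    """Return the label of the clean template closest to noisy_spec.

    templates is a hypothetical dict mapping word label -> clean spectrogram.
    """
    return min(templates, key=lambda word: l2_distance(noisy_spec, templates[word]))
```

In this sketch every word is assumed to have the same number of samples, so that all spectrograms share one shape; words of different durations would need time alignment first, which the abstract does not describe.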