• DocumentCode
    2892338
  • Title

    Decomposition of a bandpass signal and its applications to speech processing

  • Author

    Kumaresan, Ramdas ; Allu, Gopi Krishna ; Swaminathan, Jayaganesh ; Wang, Yadong

  • Author_Institution
    Dept. of Electr. Eng., Rhode Island Univ., Kingston, RI, USA
  • Volume
    2
  • fYear
    2003
  • fDate
    9-12 Nov. 2003
  • Firstpage
    2078
  • Abstract
    We have developed a novel approach to speech feature extraction based on a modulation model of a band-pass signal. Speech is processed by a bank of band-pass filters. The output of each filter is subjected to a log-derivative operation, which naturally decomposes the band-pass signal into an analytic component, $\dot{\alpha}(t) + j\hat{\dot{\alpha}}(t)$, and an antianalytic component, $\dot{\beta}(t) - j\hat{\dot{\beta}}(t)$. The average instantaneous frequency (AIF) and average log-envelope (ALE) are then extracted as coarse features at the output of each filter. We indicate how further refined features may also be extracted from the analytic and antianalytic components. We evaluated the feature extraction procedure on the Aurora 2 task, in which the noise corruption is synthetic. For clean training, and compared with the mel-cepstrum front end (with a 5-mixture HMM back end), our AIF/ALE front end achieves an average improvement of 13.97% on set A and 17.92% on set B, but a negative 'improvement' of -31.72% on set C. The overall improvement in accuracy rates for clean training is 7.97%. Although the improvements are modest, the strengths of the front end are its novelty and its potential for future enhancements.
  • Keywords
    band-pass filters; feature extraction; modulation; speech processing; Aurora 2 task; analytic-antianalytic component; average instantaneous frequency; average log-envelope; band-pass filter bank; bandpass signal decomposition; log-derivative operation; modulation model; speech feature extraction; Auditory system; Band pass filters; Feature extraction; Filter bank; Frequency; Signal processing; Speech analysis; Speech enhancement; Speech processing; Vectors
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signals, Systems and Computers, 2004. Conference Record of the Thirty-Seventh Asilomar Conference on
  • Print_ISBN
    0-7803-8104-1
  • Type

    conf

  • DOI
    10.1109/ACSSC.2003.1292346
  • Filename
    1292346
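
The coarse features named in the abstract (AIF and ALE per band-pass channel) can be illustrated with a rough sketch. This is not the authors' algorithm: it swaps in a conventional FFT-based Hilbert analytic signal in place of the paper's log-derivative decomposition, uses an ideal brick-wall filter as a stand-in for the unspecified band-pass filters, and averages over the whole signal rather than over frames. The connection to the abstract is that the log-derivative of an analytic signal has the derivative of the log-envelope as its real part and the instantaneous angular frequency as its imaginary part. All function names and parameters below are hypothetical.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via the FFT: keep DC, double positive bins, zero negative bins."""
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0          # Nyquist bin is kept once
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

def bandpass_fft(x, fs, lo, hi):
    """Ideal brick-wall band-pass filter (assumption: stand-in for the paper's filters)."""
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spectrum = np.fft.rfft(x)
    spectrum[(freqs < lo) | (freqs > hi)] = 0.0
    return np.fft.irfft(spectrum, n=len(x))

def aif_ale(x, fs, lo, hi, eps=1e-12):
    """Return (AIF in Hz, ALE in natural-log units) for one band of x."""
    z = analytic_signal(bandpass_fft(x, fs, lo, hi))
    envelope = np.abs(z)
    phase = np.unwrap(np.angle(z))
    # Instantaneous frequency as the finite-difference phase derivative, in Hz.
    inst_freq = np.diff(phase) * fs / (2.0 * np.pi)
    # eps guards against log(0) when the band is empty.
    return float(np.mean(inst_freq)), float(np.mean(np.log(envelope + eps)))

if __name__ == "__main__":
    fs = 8000
    t = np.arange(fs) / fs
    tone = np.cos(2 * np.pi * 1000 * t)
    print(aif_ale(tone, fs, 500, 1500))
```

For a unit-amplitude 1 kHz tone inside the band, the AIF should come out close to 1000 Hz and the ALE close to 0. A full front end would repeat this per filter and per analysis frame; the frame length and filter layout are not given in the abstract.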