• DocumentCode
    3343626
  • Title
    Wideband speech and audio coding using gammatone filter banks
  • Author
    Ambikairajah, E.; Epps, Julien; Lin, Lee
  • Author_Institution
    Sch. of Electr. Eng. & Telecommun., New South Wales Univ., Sydney, NSW, Australia
  • Volume
    2
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    773
  • Abstract
    Considerable research attention has been directed towards speech and audio coding algorithms capable of producing high quality coded speech and audio; however, few of these use signal representations that account for temporal as well as spectral detail. This paper presents a new technique for 16 kHz wideband speech and audio coding, whereby analysis and synthesis are performed using a linear phase gammatone filter bank. The outputs of these critical band filters are processed to obtain a series of pulse trains that represent neural firing. Auditory masking is then applied to reduce the number of pulses, producing a more compact time-frequency parameterization. The critical band gains and the pulse amplitudes and positions are then coded using a combination of non-uniform quantization, arithmetic coding and vector quantization. This coding paradigm produces high quality coded speech and audio, is based upon well-known models of the auditory system, is highly scalable, and has moderate complexity.
  • Keywords
    arithmetic codes; audio coding; channel bank filters; hearing; linear phase filters; signal representation; speech coding; speech intelligibility; speech synthesis; time-frequency analysis; vector quantisation; 16 kHz; arithmetic coding; audio analysis; audio coding algorithms; audio quality; audio synthesis; auditory masking; auditory system models; coding paradigm; critical band filters; critical band gain; linear phase gammatone filter bank; neural firing; nonuniform quantization; pulse amplitude; pulse position; pulse trains; speech analysis; speech coding algorithms; speech quality; time-frequency parameterization; vector quantization; wideband audio coding; wideband speech coding; Audio coding; Filter bank; Performance analysis; Signal representations; Signal synthesis; Speech analysis; Speech coding; Speech synthesis; Time frequency analysis; Wideband
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '01). 2001 IEEE International Conference on
  • Conference_Location
    Salt Lake City, UT
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-7041-4
  • Type
    conf
  • DOI
    10.1109/ICASSP.2001.941029
  • Filename
    941029