• DocumentCode
    3321218
  • Title

    An embedded stereo speech and audio coding method based on principal component analysis

  • Author

    Jia, Mao-shen ; Bao, Chang-chun ; Liu, Xin ; Li, Xiao-ming ; Li, Ru-wei

  • Author_Institution
    Speech & Audio Signal Process. Lab., Beijing Univ. of Technol., Beijing, China
  • fYear
    2011
  • fDate
    14-17 Dec. 2011
  • Firstpage
    321
  • Lastpage
    325
  • Abstract
    This paper adopts a compressive sampling method for MLT coefficients, based on principal component analysis (PCA) and the Modulated Lapped Transform (MLT), to extract stereo information. Using this method, an embedded variable-bit-rate stereo speech and audio coding algorithm is proposed. In this codec, stereo signals sampled at 32 kHz and 16 kHz can be coded at scalable bit rates; the bit-stream is embedded and can be divided into several layers. The core codec is ITU-T G.729.1, which processes mono signals with 7 kHz bandwidth. In addition, four extra bit rates are added: 40, 48, 56, and 64 kb/s. The maximum bit rates for wideband and super-wideband stereo signals are 48 kb/s and 64 kb/s, respectively. Objective and subjective test results show that the quality of the proposed codec is no worse than that of the reference codec requested by ITU-T.
  • Keywords
    audio coding; codecs; feature extraction; modulation coding; principal component analysis; signal sampling; speech coding; transforms; ITU-T G.729.1; audio coding method; bandwidth 7 kHz; bit rate 40 kbit/s; bit rate 48 kbit/s; bit rate 56 kbit/s; bit rate 64 kbit/s; codec; compressive sampling method; embedded stereo speech coding; embedded variable bit-rates stereo speech coding; frequency 16 kHz; frequency 32 kHz; modulated lapped transform; mono signal processing; principal component analysis; stereo information extraction; stereo signal sampling; super-wideband stereo signal; Decoding; Speech; Speech coding; Speech processing; Vectors; Wideband; audio coding; embedded coding; speech coding; stereo coding;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal Processing and Information Technology (ISSPIT), 2011 IEEE International Symposium on
  • Conference_Location
    Bilbao
  • Print_ISBN
    978-1-4673-0752-9
  • Electronic_ISBN
    978-1-4673-0751-2
  • Type

    conf

  • DOI
    10.1109/ISSPIT.2011.6151581
  • Filename
    6151581