• DocumentCode
    3179577
  • Title

    Ultra low bit-rate speech coding: An overview and recent results

  • Author

    Ramasubramanian, V.

  • Author_Institution
    Siemens Corp. Res. & Technol. - India, Bangalore, India
  • fYear
    2012
  • fDate
    22-25 July 2012
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    In narrow-band speech coding, specifically in the low and ultra low bit-rate ranges, a series of efficient quantization schemes for the LP parameters, using fixed-length as well as variable-length segment quantization (VLSQ), has resulted in a progressive reduction in the bit-rate from the 2400 bits/sec baseline of the LPC-10 coder down to 300 bits/sec and less. The VLSQ framework forms a generic basis for a class of segment vocoders within which various types of segments/units and unit-modeling have been explored, such as phones (in the phonetic vocoder), automatically derived units (phones and diphones), R/D optimal linear prediction, HMM based recognition-synthesis and unit-selection based paradigms. Recently, set within the original unit-selection framework, we proposed a joint spectral-residual quantization scheme which obviates the need for transmitting any side information about the residual of the input speech, offering up to 2 dB spectral distortion at 250 bits/sec. In this paper, in order to realize better rate-distortion performance, we propose joint spectral-residual quantization in an optimal unit-selection framework based on a modified one-pass dynamic programming (DP) algorithm.
  • Keywords
    distortion; hidden Markov models; quantisation (signal); speech coding; HMM based recognition-synthesis; LP parameters; LPC-10 coder; R/D optimal linear prediction; VLSQ framework; automatically derived units; efficient quantization; one-pass dynamic programming algorithm; progressive reduction; rate-distortion performance; ultra low bit-rate speech coding; unit-selection based paradigms; variable-length segment quantization; Decoding; Distortion measurement; Joints; Quantization; Speech; Vocoders; no residual transmission; one-pass DP; optimal unit-selection; ultra low bit-rate
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal Processing and Communications (SPCOM), 2012 International Conference on
  • Conference_Location
    Bangalore
  • Print_ISBN
    978-1-4673-2013-9
  • Type

    conf

  • DOI
    10.1109/SPCOM.2012.6290246
  • Filename
    6290246