• DocumentCode
    3421740
  • Title
    Long-term flexible 2D cepstral modeling of speech spectral amplitudes
  • Author
    Firouzmand, Mohammad ; Girin, Laurent
  • Author_Institution
    Grenoble Lab. of Images, Grenoble
  • fYear
    2008
  • fDate
    March 31, 2008 - April 4, 2008
  • Firstpage
    3937
  • Lastpage
    3940
  • Abstract
    This paper presents a method for modeling the envelope of spectral amplitude parameters of speech signals in "two dimensions" (2D). It consists of two cascaded modelings: the first one along the frequency axis is the usual cepstrum technique, which consists of modeling the log-scaled spectral envelope with a discrete cosine model (DCM). The second one, along the time axis, consists of modeling the trajectory of the envelope DCM coefficients by another similar DCM model. An iterative algorithm is proposed to optimally fit this 2D-model to the data according to a perceptual criterion based on frequency masking. This approach is shown to provide an efficient and flexible representation of spectral amplitude parameters in terms of coefficient rates, while providing good signal quality, opening new perspectives in very-low bit-rate sinusoidal speech coding.
  • Keywords
    discrete cosine transforms; iterative methods; speech processing; cascaded modelings; discrete cosine model; frequency masking; iterative algorithm; log-scaled spectral envelope; long-term flexible 2D cepstral modeling; perceptual criterion; sinusoidal speech coding; speech signals; speech spectral amplitudes; Cepstral analysis; Speech; speech analysis; speech coding; speech modeling; speech synthesis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
  • Conference_Location
    Las Vegas, NV
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-1483-3
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2008.4518515
  • Filename
    4518515
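
The two cascaded modelings described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the perceptual, masking-based iterative fit is replaced here by plain least squares, and the log-amplitude envelope is assumed to be sampled on a uniform time-frequency grid. The names `cosine_basis`, `fit_2d_dcm`, and `reconstruct` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def cosine_basis(x, order):
    """Matrix with columns cos(k * x) for k = 0..order (x expected in [0, pi])."""
    k = np.arange(order + 1)
    return np.cos(np.outer(x, k))

def fit_2d_dcm(log_amp, freq_order, time_order):
    """Least-squares fit of a separable cosine model to log_amp[frame, bin].

    Assumption: ordinary least squares stands in for the perceptual
    criterion mentioned in the abstract.
    """
    n_frames, n_bins = log_amp.shape
    # Normalized frequency and time axes, sampled at half-sample offsets.
    w = np.pi * (np.arange(n_bins) + 0.5) / n_bins
    t = np.pi * (np.arange(n_frames) + 0.5) / n_frames
    F = cosine_basis(w, freq_order)  # shape (n_bins, freq_order + 1)
    T = cosine_basis(t, time_order)  # shape (n_frames, time_order + 1)
    # Step 1 (frequency axis): cepstrum-like coefficients for every frame.
    ceps, *_ = np.linalg.lstsq(F, log_amp.T, rcond=None)
    # Step 2 (time axis): fit each coefficient trajectory with a second cosine model.
    coef2d, *_ = np.linalg.lstsq(T, ceps.T, rcond=None)
    return coef2d, F, T  # coef2d has shape (time_order + 1, freq_order + 1)

def reconstruct(coef2d, F, T):
    """Rebuild the log-amplitude surface from the 2D coefficients."""
    return T @ coef2d @ F.T
```

In this sketch a segment of `n_frames * n_bins` log-amplitude values is summarized by `(time_order + 1) * (freq_order + 1)` coefficients, so the two orders directly control the trade-off between coefficient rate and fit accuracy.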