DocumentCode
3421740
Title
Long-term flexible 2D cepstral modeling of speech spectral amplitudes
Author
Firouzmand, Mohammad ; Girin, Laurent
Author_Institution
Grenoble Lab. of Images, Grenoble
fYear
2008
fDate
March 31 2008-April 4 2008
Firstpage
3937
Lastpage
3940
Abstract
This paper presents a method for modeling the envelope of the spectral amplitude parameters of speech signals in two dimensions (2D). The method cascades two models. The first, along the frequency axis, is the usual cepstrum technique, which models the log-scaled spectral envelope with a discrete cosine model (DCM). The second, along the time axis, models the trajectories of the envelope DCM coefficients with a second DCM. An iterative algorithm is proposed to optimally fit this 2D model to the data according to a perceptual criterion based on frequency masking. The approach is shown to provide an efficient and flexible representation of spectral amplitude parameters in terms of coefficient rates while maintaining good signal quality, which opens new perspectives for very-low-bit-rate sinusoidal speech coding.
Keywords
discrete cosine transforms; iterative methods; speech processing; cascaded modelings; discrete cosine model; frequency masking; iterative algorithm; log-scaled spectral envelope; long-term flexible 2D cepstral modeling; perceptual criterion; sinusoidal speech coding; speech signals; speech spectral amplitudes; Cepstral analysis; Speech; speech analysis; speech coding; speech modeling; speech synthesis
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Conference_Location
Las Vegas, NV
ISSN
1520-6149
Print_ISBN
978-1-4244-1483-3
Electronic_ISBN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2008.4518515
Filename
4518515