Title :
Spectral excitation coding of speech at 2.4 kb/s
Author :
Cuperman, V. ; Lupini, P. ; Bhattacharya, B.
Author_Institution :
Sch. of Eng. Sci., Simon Fraser Univ., Burnaby, BC, Canada
Abstract :
We present spectral excitation coding (SEC), a speech codec based on a sinusoidal model applied to the excitation signal. A phase dispersion algorithm allows the same model to be used for voiced as well as unvoiced and transitional sounds. The phase dispersion algorithm significantly improves the perceived quality, resulting in more natural reconstructed speech. A new technique for variable dimension vector quantization, called nonsquare transform vector quantization (NSTVQ), is used for quantization of the harmonic magnitudes. The SEC system at 2.45 kb/s achieved an MOS score 0.8 points higher than the 2.4 kb/s LPC-10 standard. A preliminary 1.85 kb/s SEC system that uses zero-bit magnitude quantization is also presented. Informal listening tests indicate that the quality of the 1.85 kb/s system exceeds that of the LPC-10 standard.
Keywords :
signal reconstruction; spectral analysis; speech codecs; speech coding; speech intelligibility; vector quantisation; 2.4 kbit/s; LPC-10 standard; MOS score; harmonic magnitudes quantization; informal listening tests; natural reconstructed speech; nonsquare transform vector quantization; perceived speech quality; phase dispersion algorithm; sinusoidal model; spectral excitation coding; speech codec; speech coding; transitional sounds; unvoiced sounds; variable dimension vector quantization; zero-bit magnitude quantization; Codecs; Decoding; Encoding; Frequency; Linear predictive coding; Nonlinear filters; Signal synthesis; Speech codecs; Speech coding; Speech synthesis; System testing; Vector quantization;
Conference_Title :
Acoustics, Speech, and Signal Processing, 1995. ICASSP-95., 1995 International Conference on
Conference_Location :
Detroit, MI
Print_ISBN :
0-7803-2431-5
DOI :
10.1109/ICASSP.1995.479637