DocumentCode :
3014343
Title :
"Multirate sinusoidal transform coding at rates from 2.4 kbps to 8 kbps"
Author :
McAulay, Robert J. ; Quatieri, Thomas F.
Author_Institution :
Massachusetts Institute of Technology, Lexington, Massachusetts
Volume :
12
fYear :
1987
fDate :
April 1987
Firstpage :
1645
Lastpage :
1648
Abstract :
It has been shown [1] that an analysis/synthesis system based on a sinusoidal representation leads to synthetic speech that is essentially indistinguishable from the original. By exploiting the peak-to-peak correlation of the sine-wave amplitudes [2], a harmonic model for the sine-wave frequencies, and a predictive model for the sine-wave phases [3], it has also been shown that the sine-wave parameters can be coded at 8 kbps. In this paper a new technique is described for coding the sine-wave amplitudes based on the idea of a pitch-adaptive channel vocoder. Using this amplitude-coding strategy and operating at a total bit rate of 4.8 kbps, it was possible to code and transmit enough phase information so that very intelligible, natural-sounding speech could be synthesized. This 4.8 kbps system has been implemented in real time and has achieved a Diagnostic Rhyme Test (DRT) score of 95. At 2.4 kbps no explicit phase information could be coded, but by phase-locking all of the sine waves to the fundamental, by adding a pitch-adaptive quadratic phase, and by adding a voicing-dependent random phase to each sine wave, natural-sounding synthetic speech could be obtained. This new system is currently being implemented in real time so that intelligibility tests can be performed.
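The 2.4 kbps phase strategy named in the abstract (harmonics phase-locked to the fundamental, plus a quadratic phase term, plus a voicing-dependent random phase) can be illustrated with a minimal sketch. The abstract gives no formulas, so everything specific here is an assumption made for illustration only: the function name synthesize_frame, the quadratic term -pi*k^2/K, the (1 - voicing) scaling of the random phase, and the 8 kHz sample rate are hypothetical and are not taken from the paper.

```python
import math
import random


def synthesize_frame(f0_hz, amplitudes, voicing,
                     sample_rate=8000, n_samples=160, seed=0):
    """Sum harmonics of f0_hz whose phases are locked to the fundamental.

    amplitudes[k-1] is the amplitude of harmonic k.
    voicing is in [0, 1]; 1.0 means fully voiced (no random phase).
    All phase formulas below are illustrative assumptions.
    """
    rng = random.Random(seed)
    n_harm = len(amplitudes)

    # Per-harmonic phase offsets, fixed for the whole frame.
    offsets = []
    for k in range(1, n_harm + 1):
        # Hypothetical quadratic phase across harmonics. It depends on the
        # harmonic count, which in practice varies with pitch.
        quad = -math.pi * k * k / n_harm
        # Hypothetical random phase that grows as voicing decreases.
        jitter = (1.0 - voicing) * rng.uniform(-math.pi, math.pi)
        offsets.append(quad + jitter)

    frame = []
    for n in range(n_samples):
        # Phase of the fundamental at sample n.
        phi0 = 2.0 * math.pi * f0_hz * n / sample_rate
        sample = 0.0
        for k, amp in enumerate(amplitudes, start=1):
            # Harmonic k is locked to k times the fundamental phase.
            sample += amp * math.cos(k * phi0 + offsets[k - 1])
        frame.append(sample)
    return frame
```

For example, synthesize_frame(100.0, [1.0, 0.5, 0.25], voicing=1.0) returns one 20 ms frame at the assumed 8 kHz rate. With voicing below 1.0 the harmonics receive additional random phase offsets, which is one simple way to make a frame sound less periodic.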
Keywords :
Bit rate; Frequency; Predictive models; Real time systems; Speech analysis; Speech coding; Speech synthesis; System testing; Transform coding; Vocoders;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '87.
Type :
conf
DOI :
10.1109/ICASSP.1987.1169536
Filename :
1169536