DocumentCode
290845
Title
Non-linear prototype waveform interpolation for voiced speech encoding
Author
Li, H. ; Lockhart, G.B.
Author_Institution
Leeds Univ., UK
fYear
1995
fDate
26-29 Mar 1995
Firstpage
220
Lastpage
224
Abstract
Prototype waveform interpolation (PWI) is a practical and promising coding technique applicable to voiced speech. The waveform and duration of only one pitch cycle (the prototype) per frame are extracted and coded using LPC techniques. Segments of missing speech between the prototypes are reconstructed at the receiver by interpolation from the decoded prototype waveforms. Although waveform reconstruction may not be very accurate over the interpolated segments, surprisingly good speech quality can be achieved at bit rates in the region of 2.5 to 3.5 kb/s using frames of about 20 ms duration, provided the prototype waveforms and pitch periodicity are satisfactorily reproduced. For reasons of low complexity and bit rate, most reported work on PWI uses linear interpolation functions, but these suffer from inherent difficulties in reproducing non-uniform variations in pitch cycle waveforms and lengths. It is shown that nonlinear techniques can improve the representation of voiced speech in interpolated segments without significantly increasing bit rates. Pitch structure is improved by using a temporal differential rate codebook for transmission of small differences in the duration of pitch cycles. Waveform fidelity is improved by deriving optimal combination coefficients (OCC) which determine the composition of each pitch cycle waveform in terms of the given prototypes at segment boundaries. The OCC vectors allow for nonlinear variation in waveform composition and are vector quantised for transmission.
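The contrast the abstract draws between linear interpolation and per-cycle combination coefficients can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `linear_pwi` and `fit_combination` are hypothetical, both prototypes are assumed to be already time-normalised to the same length, and "optimal" is assumed here to mean least-squares optimal, which the abstract does not state.

```python
def linear_pwi(p0, p1, n_cycles):
    """Conventional linear PWI: cross-fade from prototype p0 to p1.

    Returns n_cycles intermediate pitch cycles whose weights move
    uniformly from p0 towards p1 (assumes len(p0) == len(p1)).
    """
    cycles = []
    for k in range(1, n_cycles + 1):
        w = k / (n_cycles + 1)  # fixed, uniformly spaced weight
        cycles.append([(1 - w) * a + w * b for a, b in zip(p0, p1)])
    return cycles


def fit_combination(target, p0, p1):
    """Assumed OCC-style fit: find (a, b) minimising
    ||target - a*p0 - b*p1||^2 via the 2x2 normal equations."""
    g00 = sum(x * x for x in p0)
    g01 = sum(x * y for x, y in zip(p0, p1))
    g11 = sum(y * y for y in p1)
    r0 = sum(t * x for t, x in zip(target, p0))
    r1 = sum(t * y for t, y in zip(target, p1))
    det = g00 * g11 - g01 * g01
    if abs(det) < 1e-12:
        raise ValueError("prototypes are (nearly) linearly dependent")
    a = (r0 * g11 - r1 * g01) / det
    b = (r1 * g00 - r0 * g01) / det
    return a, b
```

In the linear case the weights are fixed by position in the segment, so nothing needs to be transmitted. In the fitted case each intermediate cycle gets its own pair (a, b), which need not sum to one or change uniformly; vectors of such coefficients are what the abstract describes as being vector quantised.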
Keywords
interpolation; speech coding; vector quantisation; 2.5 to 3.5 kbit/s; bit rates; decoded prototype waveforms; linear predictive coding; missing speech; nonlinear prototype waveform interpolation; pitch cycle; pitch structure; speech quality; temporal differential rate codebook; voiced speech encoding; waveform fidelity; waveform reconstruction
fLanguage
English
Publisher
iet
Conference_Titel
Telecommunications, 1995. Fifth IEE Conference on
Conference_Location
Brighton
Print_ISBN
0-85296-634-2
Type
conf
DOI
10.1049/cp:19950145
Filename
396097