DocumentCode :
542234
Title :
A phase generation method for speech reconstruction from spectral envelope and pitch intervals
Author :
Kang, Hong-Goo ; Kim, Hong Kook
Author_Institution :
AT&T Labs-Research, 180 Park Avenue, Florham Park, NJ 07932, USA
Volume :
1
fYear :
2002
fDate :
13-17 May 2002
Abstract :
In this paper, we propose a new method for reconstructing speech from the spectral envelope and pitch intervals, applicable as a play-back function on the network side of a distributed speech recognition system. The spectral envelope of speech is represented as a set of mel-frequency cepstral coefficients, which are well-known recognition parameters. First, sinusoidal synthesis with a zero-phase model is used to obtain a pitch-based waveform. To enhance the naturalness of the speech, we replace the zero-phase information with pre-stored linear and random codebooks. The final phase information is determined by the energy ratio between the linear and random components. Unlike in classic low bit-rate speech coding, however, the energy ratio is estimated in the decoding stage using a time-frequency filter applied to the pitch-based synthesized signal. Thus, the phase information is not a feature parameter from the encoder side. The proposed phase generation method uses the knowledge that pitch variation is a main cause of the mixed characteristics in speech signals. An informal listening test verifies that the quality of the proposed method is much better than the synthetic quality.
Keywords :
Encoding; Filter banks; Gold; Harmonic analysis; Power harmonic filters; Robustness;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on
Conference_Location :
Orlando, FL, USA
ISSN :
1520-6149
Print_ISBN :
0-7803-7402-9
Type :
conf
DOI :
10.1109/ICASSP.2002.5743746
Filename :
5743746