DocumentCode :
1548022
Title :
Speech analysis/synthesis and modification using an analysis-by-synthesis/overlap-add sinusoidal model
Author :
George, E. Bryan ; Smith, Mark J T
Author_Institution :
Signal Process. Center of Technol., Lockheed-Martin Inc., Nashua, NH, USA
Volume :
5
Issue :
5
fYear :
1997
fDate :
9/1/1997 12:00:00 AM
Firstpage :
389
Lastpage :
406
Abstract :
Sinusoidal modeling has been successfully applied to a broad range of speech processing problems, and offers advantages over linear predictive modeling and the short-time Fourier transform for speech analysis/synthesis and modification. This paper presents a novel speech analysis/synthesis system based on the combination of an overlap-add sinusoidal model with an analysis-by-synthesis technique to determine the model parameters. It describes this analysis procedure in detail, and introduces an equivalent frequency-domain algorithm that takes advantage of the computational efficiency of the fast Fourier transform (FFT). In addition, a refined overlap-add sinusoidal model capable of shape-invariant speech modification is derived, and a pitch-scale modification algorithm is defined that preserves speech bandwidth and eliminates noise migration effects. Analysis-by-synthesis achieves very high synthetic speech quality by accurately estimating the component frequencies, eliminating sidelobe interference effects, and effectively dealing with nonstationary speech events. The refined overlap-add synthesis model correlates well with analysis-by-synthesis, and modifies speech without objectionable artifacts by explicitly controlling shape invariance and phase coherence. The proposed analysis-by-synthesis/overlap-add (ABS/OLA) system allows for both fixed and time-varying time-, frequency-, and pitch-scale modifications, and computational shortcuts using the FFT algorithm make its implementation feasible using currently available hardware.
Keywords :
correlation methods; fast Fourier transforms; frequency estimation; speech intelligibility; speech processing; speech synthesis; FFT algorithm; analysis by synthesis model; computational efficiency; correlation; fast Fourier transform; frequency estimation; frequency-domain algorithm; model parameters; nonstationary speech events; overlap-add sinusoidal model; overlap-add synthesis model; phase coherence; pitch scale modification algorithm; shape invariant speech modification; sidelobe interference effects; sinusoidal modeling; speech analysis/synthesis system; speech bandwidth; speech processing; synthetic speech quality; time varying frequency scale modification; time varying time scale modification; Algorithm design and analysis; Fourier transforms; Frequency domain analysis; Frequency estimation; Predictive models; Shape control; Speech analysis; Speech enhancement; Speech processing; Speech synthesis;
fLanguage :
English
Journal_Title :
Speech and Audio Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1063-6676
Type :
jour
DOI :
10.1109/89.622558
Filename :
622558
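Illustrative sketch :
The abstract describes estimating sinusoidal model parameters by analysis-by-synthesis. As a rough illustration of that general idea only, the Python sketch below fits sinusoids to a single frame greedily: for each candidate frequency on a grid it solves a weighted least-squares problem for amplitude and phase, keeps the candidate that removes the most error, subtracts it, and repeats. This is an assumed, simplified time-domain reading of the technique, not the paper's algorithm; it does not include the FFT-based frequency-domain formulation or the overlap-add synthesis stage. The function name `abs_sinusoid_fit`, its parameters, the periodic Hann error weighting, and the test frame are all hypothetical choices made for this example.

```python
import math


def abs_sinusoid_fit(frame, num_components, grid_size):
    """Greedy analysis-by-synthesis fit of sinusoids to one frame.

    Hypothetical helper for illustration. Returns (components, residual),
    where components is a list of (omega, amplitude, phase) tuples for the
    model amplitude * cos(omega * n + phase), omega in radians per sample.
    """
    n_samples = len(frame)
    # Periodic Hann weights for the squared-error criterion (assumed choice).
    weights = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / n_samples)
               for n in range(n_samples)]
    residual = list(frame)
    components = []

    for _ in range(num_components):
        best = None
        # Candidate frequencies strictly between DC and Nyquist.
        for k in range(1, grid_size // 2):
            omega = 2.0 * math.pi * k / grid_size
            cos_n = [math.cos(omega * n) for n in range(n_samples)]
            sin_n = [math.sin(omega * n) for n in range(n_samples)]
            # Weighted normal equations for a*cos + b*sin fitted to the residual.
            g_cc = sum(w * c * c for w, c in zip(weights, cos_n))
            g_ss = sum(w * s * s for w, s in zip(weights, sin_n))
            g_cs = sum(w * c * s for w, c, s in zip(weights, cos_n, sin_n))
            r_c = sum(w * r * c for w, r, c in zip(weights, residual, cos_n))
            r_s = sum(w * r * s for w, r, s in zip(weights, residual, sin_n))
            det = g_cc * g_ss - g_cs * g_cs
            if det <= 1e-12:
                continue  # skip ill-conditioned candidates
            a = (r_c * g_ss - r_s * g_cs) / det
            b = (r_s * g_cc - r_c * g_cs) / det
            # Reduction in weighted squared error achieved by this candidate.
            gain = a * r_c + b * r_s
            if best is None or gain > best[0]:
                best = (gain, omega, a, b)
        if best is None:
            break
        _, omega, a, b = best
        # Synthesize the chosen component and remove it from the residual.
        residual = [r - (a * math.cos(omega * n) + b * math.sin(omega * n))
                    for n, r in enumerate(residual)]
        # a*cos(wn) + b*sin(wn) == A*cos(wn + phi) with A = hypot(a, b), phi = atan2(-b, a).
        components.append((omega, math.hypot(a, b), math.atan2(-b, a)))
    return components, residual


# Made-up test frame: two sinusoids placed exactly on the candidate grid.
N = 256
frame = [1.0 * math.cos(2.0 * math.pi * 10 * n / N + 0.3)
         + 0.5 * math.cos(2.0 * math.pi * 40 * n / N - 1.0)
         for n in range(N)]
components, residual = abs_sinusoid_fit(frame, num_components=2, grid_size=N)
```

Each entry of `components` holds one estimated frequency, amplitude, and phase, and `residual` is what remains of the frame after the fitted sinusoids are subtracted. Subtracting each component before searching for the next is what keeps a strong sinusoid's sidelobes from biasing later estimates; a practical system would use a finer frequency grid and an FFT to evaluate the candidates efficiently.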