Title :
Speech data compression through sparse coding of innovations
Author :
Ramabadran, Tenkasi V. ; Sinha, Deepen
Author_Institution :
Dept. of Electr. Eng. & Comput. Eng., Iowa State Univ., Ames, IA, USA
fDate :
4/1/1994
Abstract :
A new scheme for coding speech at low bit rates (4.8-16 kb/s) while maintaining high quality is described. Speech is regarded as a piecewise-stationary random signal, and its synthesis is accomplished by means of a Kalman estimator at the decoder. To operate, the Kalman estimator requires a signal model and a sequence of measurements of the states of that model. A two-stage, time-varying, all-pole filter excited by white noise is used as the speech signal model. Linear combinations of speech samples, taken at sparse but periodic intervals and provided in the form of innovations, serve as the measurements. The role of the encoder in the proposed scheme is to extract the signal model parameters, form the measurements, and transmit this information to the decoder. An optimum measurement strategy is developed for the estimator. A procedure for shaping the error spectrum of the synthesized speech is also described. Simulation studies show that coders based on the proposed scheme can provide high-quality speech at low bit rates. Important implementation details of such coders, as well as their performance results for different choices of coder parameters, are given.
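The decoder idea in the abstract can be illustrated with a minimal sketch, which is not the authors' coder. It makes several simplifying assumptions that do not come from the paper: a single-stage, time-invariant all-pole model instead of the two-stage, time-varying one; a fixed measurement vector `h` instead of the optimum measurement strategy; and unquantized innovations. All function names and parameters (`companion`, `predict`, `update`, `encode`, `decode`, `synthesize`, `period`, `q`, `r`) are hypothetical. The encoder and decoder run the same Kalman recursion, so only the innovation at every `period`-th sample has to be sent.

```python
import random

def companion(a):
    """Companion matrix for s[k] = sum(a[i] * s[k-1-i]) + w[k] (assumed model)."""
    p = len(a)
    A = [[0.0] * p for _ in range(p)]
    A[0] = list(a)
    for i in range(1, p):
        A[i][i - 1] = 1.0
    return A

def predict(A, x, P, q):
    """Time update: x <- A x, P <- A P A^T, plus q on the excited state."""
    p = len(x)
    x_new = [sum(A[i][j] * x[j] for j in range(p)) for i in range(p)]
    AP = [[sum(A[i][k] * P[k][j] for k in range(p)) for j in range(p)]
          for i in range(p)]
    P_new = [[sum(AP[i][k] * A[j][k] for k in range(p)) for j in range(p)]
             for i in range(p)]
    P_new[0][0] += q  # white-noise excitation drives only the newest sample
    return x_new, P_new

def update(x, P, h, innovation, r):
    """Measurement update for a scalar measurement y = h . x + noise(var r)."""
    p = len(x)
    Ph = [sum(P[i][j] * h[j] for j in range(p)) for i in range(p)]
    s = sum(h[i] * Ph[i] for i in range(p)) + r  # innovation variance
    K = [v / s for v in Ph]                      # Kalman gain
    x_new = [x[i] + K[i] * innovation for i in range(p)]
    P_new = [[P[i][j] - K[i] * Ph[j] for j in range(p)] for i in range(p)]
    return x_new, P_new

def _init(a):
    p = len(a)
    return companion(a), [0.0] * p, [[float(i == j) for j in range(p)]
                                     for i in range(p)]

def encode(speech, a, h, period, q, r):
    """Run the estimator on the true signal; emit one innovation per period."""
    A, x, P = _init(a)
    p = len(a)
    innovations = []
    for k in range(len(speech)):
        x, P = predict(A, x, P, q)
        if k % period == 0:
            state = [speech[k - i] if k - i >= 0 else 0.0 for i in range(p)]
            y = sum(h[i] * state[i] for i in range(p))  # linear combination
            e = y - sum(h[i] * x[i] for i in range(p))  # innovation
            innovations.append(e)
            x, P = update(x, P, h, e, r)
    return innovations

def decode(innovations, n, a, h, period, q, r):
    """Rebuild n samples from the model and the sparse innovations."""
    A, x, P = _init(a)
    out = []
    for k in range(n):
        x, P = predict(A, x, P, q)
        if k % period == 0:
            x, P = update(x, P, h, innovations[k // period], r)
        out.append(x[0])  # estimate of the current sample
    return out

def synthesize(a, n, q):
    """Test signal: the assumed all-pole model driven by white Gaussian noise."""
    s = []
    for k in range(n):
        past = sum(a[i] * s[k - 1 - i] for i in range(len(a)) if k - 1 - i >= 0)
        s.append(past + random.gauss(0.0, q ** 0.5))
    return s
```

For example, calling `encode(sig, [1.5, -0.7], [1.0, 0.0], 4, 1.0, 0.0)` on a signal from `synthesize` yields one innovation per four samples, and passing those to `decode` with the same arguments returns a full-length estimate. Between measurements the decoder can only extrapolate from the model, which is why the choice of measurement combinations matters in the scheme described above.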
Keywords :
Kalman filters; linear predictive coding; random processes; speech coding; speech intelligibility; stochastic processes; time-varying networks; white noise; 4.8 to 16 kbit/s; Kalman estimator; decoder; encoder; error spectrum shaping; innovations; low bit rates; measurement; piecewise-stationary random signal; signal model parameters; simulation; sparse coding; speech data compression; speech quality; speech samples; speech signal model; speech synthesis; time-varying all-pole filter; Bit rate; Data compression; Decoding; Signal synthesis; State estimation; Technological innovation
Journal_Title :
Speech and Audio Processing, IEEE Transactions on