Title :
Sparse representation and epoch estimation of voiced speech
Author :
Gunther, Jacob ; Moon, Thomas
Author_Institution :
Dept. of Electr. & Comput. Eng., Utah State Univ., Logan, UT, USA
Abstract :
Whereas most approaches to linear speech prediction fail to account for the quasi-periodic glottal flow, this paper incorporates a model for the glottal flow derivative (GFD) directly into the linear prediction problem. A linear model for the prediction error is obtained by constructing a dictionary of time-shifted GFD pulses. The pulses are constructed by applying glottal inverse filtering (GIF) to recorded speech. Minimizing the difference between the linear prediction residual and a sparse combination of the pulses in the dictionary leads to joint estimation of the linear predictor and a sparse representation of the prediction error that reveals the instants of vocal tract excitation (epochs). The method is applied to voiced segments extracted from the CMU Arctic dataset, which also includes electro-glottograms. Results show that the proposed method is effective in estimating the parameters of interest and that GIF-based pulses model the GFD pulses occurring in real speech more accurately than pulses computed from mathematical models.
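Illustrative sketch :
The abstract does not state the optimization procedure, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' algorithm. It assumes an l1 penalty as the sparsity measure, an alternating scheme (least squares for the predictor, iterative soft thresholding for the sparse code), and a single prototype pulse. The names `shifted_pulse_dictionary`, `soft_threshold`, `joint_lp_sparse`, and the parameters `order`, `lam`, `outer`, `inner` are hypothetical and chosen for illustration.

```python
import numpy as np

def shifted_pulse_dictionary(pulse, n):
    """Return an n-by-n matrix whose column k is `pulse` delayed by k samples
    (truncated at the end of the frame)."""
    D = np.zeros((n, n))
    for k in range(n):
        m = min(len(pulse), n - k)
        D[k:k + m, k] = pulse[:m]
    return D

def soft_threshold(v, t):
    """Elementwise soft thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def joint_lp_sparse(s, pulse, order=10, lam=0.1, outer=10, inner=50):
    """Assumed objective: 0.5 * ||(y - S a) - D x||^2 + lam * ||x||_1,
    where y - S a is the linear prediction residual and D x is a sparse
    combination of time-shifted pulses. Alternates over a and x."""
    s = np.asarray(s, dtype=float)
    N = len(s)
    y = s[order:]                                   # samples to be predicted
    # Column i holds the samples delayed by i, for i = 1..order
    S = np.column_stack([s[order - i:N - i] for i in range(1, order + 1)])
    D = shifted_pulse_dictionary(np.asarray(pulse, dtype=float), len(y))
    step = 1.0 / np.linalg.norm(D, 2) ** 2          # 1 / Lipschitz constant
    x = np.zeros(len(y))
    for _ in range(outer):
        # Predictor update: least squares with the sparse code held fixed
        a, *_ = np.linalg.lstsq(S, y - D @ x, rcond=None)
        r = y - S @ a                               # current LP residual
        # Sparse-code update: ISTA iterations with the predictor held fixed
        for _ in range(inner):
            x = soft_threshold(x + step * (D.T @ (r - D @ x)), step * lam)
    return a, x

# Usage on placeholder data (random samples, not speech)
rng = np.random.default_rng(0)
frame = rng.standard_normal(400)
a_hat, x_hat = joint_lp_sparse(frame, pulse=[1.0, -0.5, 0.25], order=8)
```

In such a sketch, the indices of the large entries of `x_hat` (offset by `order` samples) would serve as candidate epoch locations; in the paper the pulse shapes come from GIF applied to recorded speech rather than from a fixed vector as assumed here.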
Keywords :
estimation theory; filtering theory; signal representation; speech processing; CMU Arctic dataset; GIF; electro-glottograms; epoch estimation; glottal flow derivative; glottal inverse filtering; linear prediction problem; linear prediction residual; linear predictor; linear speech prediction; prediction error; quasiperiodic glottal flow; sparse combination; sparse representation; time-shifted GFD pulses; vocal tract excitation; voiced segments; voiced speech; Acoustics; Dictionaries; Mathematical model; Speech; Speech coding; Speech processing; Linear speech prediction; epoch detection; sparse signal recovery;
Conference_Title :
Applications of Signal Processing to Audio and Acoustics (WASPAA), 2013 IEEE Workshop on
Conference_Location :
New Paltz, NY
DOI :
10.1109/WASPAA.2013.6701885