Title :
Joint linear prediction and epoch estimation of voiced speech using a basis where the prediction residual can be sparsely represented
Author :
Gunther, Jacob ; Moon, Thomas
Author_Institution :
Dept. of Electr. & Comput. Eng., Utah State Univ., Logan, UT, USA
Abstract :
Whereas most approaches to linear speech prediction fail to account for the quasi-periodic glottal flow, this paper incorporates the Liljencrants-Fant model for the glottal flow derivative (GFD) directly into the linear prediction problem. A linear model for the prediction error is obtained by constructing a dictionary of time-shifted GFD pulses. Minimizing the difference between the linear prediction residual and a sparse combination of the pulses in the dictionary leads to joint estimation of the linear predictor and a sparse representation of the prediction error that reveals the instants of vocal tract excitation (epochs). A greedy algorithm is proposed to approximately solve this joint estimation problem. The method is applied to voiced segments extracted from the CMU Arctic dataset, which also includes electro-glottograms. Results show that the approach and the proposed algorithm are effective in estimating the parameters of interest.
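The joint estimation described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes an orthogonal-matching-pursuit style greedy loop in which the predictor coefficients are always re-fitted by least squares together with the currently selected pulses. The function names (`lagged_matrix`, `shifted_pulse_dictionary`, `joint_lp_sparse`), the parameters `p` (predictor order) and `K` (number of pulses to select), and the caller-supplied `pulse` shape are all hypothetical; a real implementation would presumably sample `pulse` from the Liljencrants-Fant model, which is not implemented here.

```python
import numpy as np


def lagged_matrix(s, p):
    """Row for time n (n = p, ..., N-1) holds s[n-1], ..., s[n-p]."""
    N = len(s)
    return np.column_stack([s[p - k:N - k] for k in range(1, p + 1)])


def shifted_pulse_dictionary(pulse, n_rows):
    """Column j holds the pulse starting at row j, truncated at the end."""
    D = np.zeros((n_rows, n_rows))
    for j in range(n_rows):
        length = min(len(pulse), n_rows - j)
        D[j:j + length, j] = pulse[:length]
    return D


def joint_lp_sparse(s, pulse, p, K):
    """Greedy sketch (assumed, OMP-like) of joint LP and sparse residual fit.

    Approximately minimizes ||y - S a - D x||_2 over the predictor a and
    a K-sparse x, where y = s[p:], S holds lagged samples, and D holds
    time-shifted copies of `pulse`.
    Returns (a, epochs, x, r): predictor coefficients, selected pulse
    onsets as sample indices into s, their amplitudes, and the final
    fitting error.
    """
    s = np.asarray(s, dtype=float)
    y = s[p:]
    S = lagged_matrix(s, p)
    D = shifted_pulse_dictionary(np.asarray(pulse, dtype=float), len(y))
    # Guard against zero-norm columns when normalizing correlations.
    norms = np.maximum(np.linalg.norm(D, axis=0), 1e-12)

    # Start from ordinary least-squares linear prediction (no pulses).
    coef, *_ = np.linalg.lstsq(S, y, rcond=None)
    r = y - S @ coef

    support = []
    for _ in range(K):
        # Pick the unused shifted pulse most correlated with the error.
        score = np.abs(D.T @ r) / norms
        if support:
            score[support] = -np.inf
        support.append(int(np.argmax(score)))
        # Re-fit predictor and pulse amplitudes jointly.
        A = np.hstack([S, D[:, support]])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        r = y - A @ coef

    a, x = coef[:p], coef[p:]
    epochs = np.array(support) + p  # shift back to indices into s
    return a, epochs, x, r
```

Calling `joint_lp_sparse(s, pulse, p, K)` on a voiced frame `s` returns the estimated predictor and the selected pulse positions. Because every joint re-fit includes the lagged-sample columns, the fitting error can never exceed that of plain least-squares linear prediction on the same frame; whether the selected positions coincide with true epochs depends on the pulse shape and the data.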
Keywords :
signal representation; speech processing; CMU Arctic dataset; GFD; Liljencrants-Fant model; electro-glottograms; epoch estimation; glottal flow derivative; joint linear prediction; linear prediction error model; linear speech prediction; prediction residual; quasiperiodic glottal flow; sparse speech representation; time-shifted GFD pulses dictionary; vocal tract excitation; voiced segments; voiced speech; Dictionaries; Greedy algorithms; Predictive models; Shape; Speech; Speech coding; Vectors; Linear speech prediction; epoch detection; sparse signal recovery;
Conference_Title :
Digital Signal Processing and Signal Processing Education Meeting (DSP/SPE), 2013 IEEE
Conference_Location :
Napa, CA
Print_ISBN :
978-1-4799-1614-6
DOI :
10.1109/DSP-SPE.2013.6642592