DocumentCode
667539
Title
Sparse representation and epoch estimation of voiced speech
Author
Gunther, Jacob ; Moon, Thomas
Author_Institution
Dept. of Electr. & Comput. Eng., Utah State Univ., Logan, UT, USA
fYear
2013
fDate
20-23 Oct. 2013
Firstpage
1
Lastpage
4
Abstract
Whereas most approaches to linear speech prediction fail to account for the quasi-periodic glottal flow, this paper incorporates a model for the glottal flow derivative (GFD) directly into the linear prediction problem. A linear model for the prediction error is obtained by constructing a dictionary of time-shifted GFD pulses. The pulses are constructed by applying glottal inverse filtering (GIF) to recorded speech. Minimizing the difference between the linear prediction residual and a sparse combination of the pulses in the dictionary leads to joint estimation of the linear predictor and a sparse representation of the prediction error that reveals the instants of vocal tract excitation (epochs). The method is applied to voiced segments extracted from the CMU Arctic dataset, which also includes electro-glottograms. Results show that the proposed method is effective in estimating the parameters of interest and that GIF-based pulses model the GFD pulses occurring in real speech more accurately than pulses computed from mathematical models.
Keywords
estimation theory; filtering theory; signal representation; speech processing; CMU Arctic dataset; GIF; electro-glottograms; epoch estimation; glottal flow derivative; glottal inverse filtering; linear prediction problem; linear prediction residual; linear predictor; linear speech prediction; prediction error; quasiperiodic glottal flow; sparse combination; sparse representation; time-shifted GFD pulses; vocal tract excitation; voiced segments; voiced speech; Acoustics; Dictionaries; Mathematical model; Speech; Speech coding; Speech processing; Linear speech prediction; epoch detection; sparse signal recovery;
fLanguage
English
Publisher
IEEE
Conference_Titel
Applications of Signal Processing to Audio and Acoustics (WASPAA), 2013 IEEE Workshop on
Conference_Location
New Paltz, NY
ISSN
1931-1168
Type
conf
DOI
10.1109/WASPAA.2013.6701885
Filename
6701885