Regional Information Center for Science and Technology (RICeST) - Sparse representation and epoch estimation of voiced speech

DocumentCode :

667539

Title :

Sparse representation and epoch estimation of voiced speech

Author :

Gunther, Jacob ; Moon, Thomas

Author_Institution :

Dept. of Electr. & Comput. Eng., Utah State Univ., Logan, UT, USA

fYear :

2013

fDate :

20-23 Oct. 2013

Firstpage :

Lastpage :

Abstract :

Whereas most approaches to linear speech prediction fail to account for the quasi-periodic glottal flow, this paper incorporates a model for the glottal flow derivative (GFD) directly into the linear prediction problem. A linear model for the prediction error is obtained by constructing a dictionary of time-shifted GFD pulses. The pulses are constructed by applying glottal inverse filtering (GIF) to recorded speech. Minimizing the difference between the linear prediction residual and a sparse combination of the pulses in the dictionary leads to joint estimation of the linear predictor and of a sparse representation for the prediction error that reveals the instants of vocal tract excitation (epochs). The method is applied to voiced segments extracted from the CMU Arctic dataset, which also includes electro-glottograms. Results show that the proposed method is effective in estimating the parameters of interest and that GIF-based pulses model the GFD pulses occurring in real speech more accurately than pulses computed using mathematical models.
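Illustrative sketch (not part of the catalogue record and not the authors' implementation): the abstract describes fitting the linear prediction residual with a sparse combination of time-shifted pulses. One way to write that down, assuming an l1 penalty and a simple alternating scheme that the abstract does not specify, is to minimize 0.5 * ||s - S a - D x||^2 + lam * ||x||_1 over the predictor coefficients a and the sparse weights x. All function names, the penalty weight `lam`, and the iteration counts below are hypothetical choices made for illustration.

```python
import numpy as np

def shifted_pulse_dictionary(pulse, n):
    """Column k holds `pulse` delayed by k samples, truncated to length n."""
    D = np.zeros((n, n))
    for k in range(n):
        m = min(len(pulse), n - k)
        D[k:k + m, k] = pulse[:m]
    return D

def past_samples_matrix(s, p):
    """Row t holds s[t-1], ..., s[t-p]; samples before the segment are zero."""
    n = len(s)
    S = np.zeros((n, p))
    for j in range(1, p + 1):
        S[j:, j - 1] = s[:n - j]
    return S

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def joint_lp_sparse(s, pulse, p=2, lam=0.1, n_outer=20, n_ista=50):
    """Alternate a least-squares predictor update with ISTA steps on x.

    Assumed objective: 0.5 * ||s - S a - D x||^2 + lam * ||x||_1.
    Large entries of x mark candidate excitation instants (epochs).
    """
    n = len(s)
    S = past_samples_matrix(s, p)
    D = shifted_pulse_dictionary(pulse, n)
    step = 1.0 / np.linalg.norm(D, 2) ** 2  # 1 / (squared spectral norm of D)
    x = np.zeros(n)
    for _ in range(n_outer):
        # Predictor update: exact least squares with x held fixed.
        a, *_ = np.linalg.lstsq(S, s - D @ x, rcond=None)
        r = s - S @ a  # linear prediction residual for the current a
        # Sparse update: ISTA iterations with a held fixed.
        for _ in range(n_ista):
            x = soft_threshold(x + step * (D.T @ (r - D @ x)), step * lam)
    return a, x
```

In this sketch `pulse` stands in for a GFD pulse, which the paper obtains by glottal inverse filtering; any short pulse shape can be substituted for experimentation. Both updates decrease the assumed objective, so the scheme converges in cost, but it carries no guarantee of recovering the true epochs.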

Keywords :

estimation theory; filtering theory; signal representation; speech processing; CMU Arctic dataset; GIF; electro-glottograms; epoch estimation; glottal flow derivative; glottal inverse filtering; linear prediction problem; linear prediction residual; linear predictor; linear speech prediction; prediction error; quasiperiodic glottal flow; sparse combination; sparse representation; time-shifted GFD pulses; vocal tract excitation; voiced segments; voiced speech; Acoustics; Dictionaries; Mathematical model; Speech; Speech coding; Speech processing; Linear speech prediction; epoch detection; sparse signal recovery;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Applications of Signal Processing to Audio and Acoustics (WASPAA), 2013 IEEE Workshop on

Conference_Location :

New Paltz, NY

ISSN :

1931-1168

Type :

conf

DOI :

10.1109/WASPAA.2013.6701885

Filename :

6701885

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=667539