DocumentCode :
766383
Title :
Direct speech feature estimation using an iterative EM algorithm for vocal fold pathology detection
Author :
Gavidia-Ceballos, Liliana ; Hansen, John H L
Author_Institution :
Dept. of Biomed. Eng., Duke Univ., Durham, NC, USA
Volume :
43
Issue :
4
fYear :
1996
fDate :
4/1/1996 12:00:00 AM
Firstpage :
373
Lastpage :
383
Abstract :
The focus of this study is to formulate a speech parameter estimation algorithm for analysis/detection of vocal fold pathology. The speech processing algorithm proposed estimates features necessary to formulate a stochastic model to characterize healthy and pathology conditions from speech recordings. The general idea is to separate speech components under healthy and assumed pathology conditions. This problem is addressed using an iterative maximum-likelihood (ML) estimation procedure, based on the estimation-maximization (EM) algorithm. A new feature for characterizing pathology, termed enhanced-spectral-pathology component (ESPC), is estimated and shown to vary consistently between healthy and pathology conditions. It is also shown that the mean-area-peak-value (MAPV) and the weighted-slope (WSLOPE) indexes, which are obtained from the ESPC estimate, are meaningful measures of speech pathology conditions. For classification purposes, a five-state hidden-Markov-model (HMM) recognizer was formulated, based on the MAPV, WSLOPE, and ESPC spectral features. A set of log Mel-frequency filter bank coefficients were used to parameterize the ESPC feature. An evaluation of the HMM-based classifier was performed using speech recordings from healthy and vocal fold cancer patients of sustained vowel sounds. It is shown that while both MAPV and WSLOPE are useful features for vocal fold pathology detection, superior performance was achieved using a finer spectral representation of ESPC (e.g., a detection rate of 88.7% for pathology and 92.8% for healthy condition). One main advantage of the proposed method is that it does not require direct estimation of the glottal flow waveform. Therefore, the limitation of the inability to characterize vocal fold pathology, due to incomplete glottal closure, is no longer an issue. 
The results suggest that general analysis of the ESPC feature can provide a quantitative, noninvasive approach for analysis, detection, and characterization of speech production under vocal fold pathology.
Keywords :
feature extraction; hidden Markov models; iterative methods; medical signal processing; parameter estimation; speech processing; direct speech feature estimation; enhanced-spectral-pathology component; five-state hidden-Markov-model recognizer; glottal flow waveform; healthy conditions; iterative EM algorithm; log Mel-frequency filter bank coefficients; pathology conditions; quantitative noninvasive approach; speech parameter estimation algorithm; speech production characterization; stochastic model; vocal fold pathology detection; weighted-slope index; Algorithm design and analysis; Hidden Markov models; Iterative algorithms; Maximum likelihood detection; Maximum likelihood estimation; Parameter estimation; Pathology; Speech analysis; Speech processing; Stochastic processes; Algorithms; Artifacts; Fourier Analysis; Humans; Laryngeal Neoplasms; Likelihood Functions; Models, Biological; Reference Values; Speech Production Measurement; Stochastic Processes; Vocal Cords;
fLanguage :
English
Journal_Title :
Biomedical Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9294
Type :
jour
DOI :
10.1109/10.486257
Filename :
486257