Automatic detection of voice impairments due to vocal misuse by means of Gaussian mixture models

Author

Godino-Llorente, Juan I. ; Aguilera-Navarro, Santiago ; Gomez-Vilda, Pedro

Author_Institution

Lab. de Tecnologia de Rehabilitacion, Univ. Politecnica de Madrid, Spain

Volume

2

fYear

2001

fDate

2001

Firstpage

1723

Abstract

The modern way of life is increasing the risk of vocal and voice diseases. It is well known that most of these diseases cause changes in the acoustic voice signal, and they have to be diagnosed and treated at an early stage. Acoustic analysis is a non-invasive technique based on digital processing of the speech signal. It could be a useful tool for diagnosing this kind of disease and offers several advantages: it is non-invasive, it provides an objective diagnosis, and it can be used to evaluate surgical and pharmacological treatments and rehabilitation processes. ENT clinicians use acoustic voice analysis to characterise pathological voices. In this paper, we study a classification approach that is well known in speaker recognition and identification, applied to the automatic detection of voice disorders. Previous and current work demonstrates that impaired voice detection can be carried out by means of supervised neural nets, specifically the multilayer perceptron. We have focused on detecting impaired voices by means of Gaussian mixture models and parameters such as mel frequency cepstral coefficients extracted from the windowed voice signal.
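The detection idea named in the abstract and keywords (Gaussian mixture models scored with a likelihood ratio test) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the diagonal covariances, the two-dimensional toy feature vectors standing in for mel frequency cepstral coefficients, the zero threshold, and all function names and model parameters are assumptions made here for the example.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(x, gmm):
    """Log p(x | gmm); gmm is a list of (weight, mean, var) components."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp trick for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def mean_log_likelihood_ratio(frames, gmm_impaired, gmm_normal):
    """Average per-frame log-likelihood ratio, impaired versus normal."""
    total = sum(
        gmm_log_likelihood(f, gmm_impaired) - gmm_log_likelihood(f, gmm_normal)
        for f in frames
    )
    return total / len(frames)

def is_impaired(frames, gmm_impaired, gmm_normal, threshold=0.0):
    """Flag the recording as impaired when the ratio exceeds the threshold."""
    return mean_log_likelihood_ratio(frames, gmm_impaired, gmm_normal) > threshold

# Hypothetical toy models; real models would be trained on labelled voices.
GMM_NORMAL = [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])]
GMM_IMPAIRED = [(0.5, [4.0, 4.0], [1.0, 1.0]), (0.5, [5.0, 5.0], [1.0, 1.0])]

# Hypothetical feature frames, one vector per analysis window.
frames_a = [[4.2, 4.1], [4.8, 5.0]]
frames_b = [[0.1, 0.2], [0.9, 1.1]]
decision_a = is_impaired(frames_a, GMM_IMPAIRED, GMM_NORMAL)
decision_b = is_impaired(frames_b, GMM_IMPAIRED, GMM_NORMAL)
```

In practice the threshold would be tuned on held-out recordings to trade off missed detections against false alarms, and the mixture parameters would be estimated from training data rather than fixed by hand.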

Keywords

Gaussian distribution; cepstral analysis; fast Fourier transforms; feature extraction; learning (artificial intelligence); maximum likelihood estimation; medical signal processing; multilayer perceptrons; speech; speech processing; Gaussian mixture models; acoustic voice analysis; automatic detection; feature extraction; learning algorithm; likelihood ratio test; mel frequency coefficients; multilayer perceptron; parameter representation; pathological voices; probability of correct detection; short-time FFT; speaker identification; speaker recognition; supervised neural nets; temporal order; vocal diseases; vocal misuse; voice impairments; windowed voice signal; Acoustic signal detection; Diseases; Neural networks; Pathology; Signal analysis; Signal processing; Speaker recognition; Speech analysis; Speech processing; Surgery;

fLanguage

English

Publisher

IEEE

Conference_Titel

Engineering in Medicine and Biology Society, 2001. Proceedings of the 23rd Annual International Conference of the IEEE

ISSN

1094-687X

Print_ISBN

0-7803-7211-5

Type

conf

DOI

10.1109/IEMBS.2001.1020549

Filename

1020549