• DocumentCode
    1995127
  • Title
    Automatic detection of voice impairments due to vocal misuse by means of Gaussian mixture models
  • Author
    Godino-Llorente, Juan I. ; Aguilera-Navarro, Santiago ; Gomez-Vilda, Pedro
  • Author_Institution
    Lab. de Tecnologia de Rehabilitacion, Univ. Politecnica de Madrid, Spain
  • Volume
    2
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    1723
  • Abstract
    The modern way of life carries an increasing risk of vocal and voice diseases. It is well known that most of these diseases cause changes in the acoustic voice signal, and that they have to be diagnosed and treated at an early stage. Acoustic analysis, a technique based on digital processing of the speech signal, could be a useful tool for diagnosing such diseases. It has several advantages: it is non-invasive, it provides an objective diagnosis, and it can be used to evaluate surgical and pharmacological treatments and rehabilitation processes. ENT clinicians use acoustic voice analysis to characterise pathological voices. In this paper, we study a classification approach that is well known in speaker recognition and identification, applied here to the automatic detection of voice disorders. Previous and current work demonstrates that impaired voice detection can be carried out by means of supervised neural networks, specifically the multilayer perceptron. We focus on the detection of impaired voices by means of Gaussian mixture models, using parameters such as mel frequency cepstral coefficients extracted from the windowed voice signal.
  • Keywords
    Gaussian distribution; cepstral analysis; fast Fourier transforms; feature extraction; learning (artificial intelligence); maximum likelihood estimation; medical signal processing; multilayer perceptrons; speech; speech processing; Gaussian mixture models; acoustic voice analysis; automatic detection; feature extraction; learning algorithm; likelihood ratio test; mel frequency coefficients; multilayer perceptron; parameter representation; pathological voices; probability of correct detection; short-time FFT; speaker identification; speaker recognition; supervised neural nets; temporal order; vocal diseases; vocal misuse; voice impairments; windowed voice signal; Acoustic signal detection; Diseases; Neural networks; Pathology; Signal analysis; Signal processing; Speaker recognition; Speech analysis; Speech processing; Surgery;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Engineering in Medicine and Biology Society, 2001. Proceedings of the 23rd Annual International Conference of the IEEE
  • ISSN
    1094-687X
  • Print_ISBN
    0-7803-7211-5
  • Type
    conf
  • DOI
    10.1109/IEMBS.2001.1020549
  • Filename
    1020549