• DocumentCode
    668225
  • Title

    On aligning techniques, feature extraction and distance measures for Isolated Word Recognition

  • Author

    Valencia-Ramirez, Jose Maria ; Camarena-Ibarrola, A.

  • Author_Institution
    Div. de Estudios de Posgrado, Univ. Michoacana de San Nicolas de Hidalgo, Morelia, Mexico
  • fYear
    2013
  • fDate
    13-15 Nov. 2013
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    Discrete Hidden Markov Models (DHMMs) are used in Automatic Speech Recognition (ASR) systems to model the dynamics of utterances as stochastic processes. Some researchers, however, prefer Dynamic Time Warping (DTW) to deal with variations in the temporal evolution of utterances of the same word. Furthermore, some researchers in the field of ASR recommend Mel Frequency Cepstral Coefficients (MFCC) as the relevant features to extract from the speech signal, while others use Linear Prediction Coefficients (LPC) for that purpose. To evaluate the similarity of feature vectors, we may use the Euclidean distance, the cosine distance, or the Itakura distance (when using LPC). We would like to know which combination of techniques ASR developers should use for the specific problem of Isolated Word Recognition. We implemented a number of ASR systems by changing the feature extraction module, the aligning technique, the distance measure, or the parameter values, and compared them for the benefit of those interested in developing Isolated Word Recognition systems. In this paper we report the results of our experiments, using Receiver Operating Characteristic (ROC) curves to show which ASR system achieved the highest recognition rate.
  • Keywords
    feature extraction; hidden Markov models; speech recognition; ASR systems; DHMM; ROC curves; aligning techniques; automatic speech recognition systems; discrete hidden Markov models; distance measures; feature extraction module; isolated word recognition; receiver operating characteristics; Automatic speech recognition; Mel frequency cepstral coefficient; Silicon; Smoothing methods; Vectors; DTW
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Power, Electronics and Computing (ROPEC), 2013 IEEE International Autumn Meeting on
  • Conference_Location
    Mexico City
  • Type

    conf

  • DOI
    10.1109/ROPEC.2013.6702733
  • Filename
    6702733