• DocumentCode
    2814698
  • Title
    Recognition of Phonemes In a Continuous Speech Stream By Means of PARCOR Parameter In LPC Vocoder
  • Author
    Cui, Ying ; Takaya, Kunio
  • Author_Institution
    Univ. of Saskatchewan, Saskatoon
  • fYear
    2007
  • fDate
    22-26 April 2007
  • Firstpage
    1606
  • Lastpage
    1609
  • Abstract
    Linear Predictive Coding (LPC) has been used to compress and encode speech signals for digital transmission at a low bit rate. LPC determines an FIR system that predicts a speech sample from past samples by minimizing the squared error between the actual sample and its estimate. The coefficients of the FIR system are encoded and sent. At the receiving end, the inverse system, called the AR model, is excited by a random signal to reproduce the encoded speech. The use of LPC can be extended to speech recognition, since the FIR coefficients are a condensed representation of a speech segment of typically 10 ms to 30 ms. The PARCOR parameters associated with LPC, which represent a vocal tract model based on a lattice filter structure, are considered for speech recognition. The use of FIR coefficients and the frequency response of the AR model were previously investigated [1]. This paper reports a method to detect a limited number of phonemes from a continuous stream of speech. The system being developed slides a time window of 16 ms and calculates the PARCOR parameters continuously, feeding them to a classifier. The classifier is supervised, requires training, and uses the Maximum Likelihood Decision Rule. Training uses the TIMIT speech database, which contains recordings of 630 speakers of 8 major dialects of American English. The classification results for some typical vowel and consonant phonemes segmented from continuous speech are listed. The correct classification rates for vowels and consonants are 65.22% and 93.51%, respectively. Overall, they indicate that the PARCOR parameters have the potential to characterize phonemes.
  • Keywords
    FIR filters; autoregressive processes; linear predictive coding; maximum likelihood decoding; speech coding; speech recognition; vocoders; AR model; American English; FIR system; LPC vocoder; PARCOR parameter; TIMIT speech database; continuous speech stream; frequency response; lattice filter; linear predictive coding; maximum likelihood decision rule; phonemes recognition; speech recognition; speech signal compression; speech signals; supervised classifier; vocal tract model; Bit rate; Finite impulse response filter; Frequency response; Lattices; Linear predictive coding; Maximum likelihood detection; Maximum likelihood estimation; Speech coding; Speech recognition; Vocoders;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Electrical and Computer Engineering, 2007. CCECE 2007. Canadian Conference on
  • Conference_Location
    Vancouver, BC
  • ISSN
    0840-7789
  • Print_ISBN
    1-4244-1020-7
  • Electronic_ISBN
    0840-7789
  • Type
    conf
  • DOI
    10.1109/CCECE.2007.402
  • Filename
    4233061
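
The abstract describes sliding a 16 ms window over the speech and computing PARCOR parameters for each window position. The following is a minimal sketch of that step, not the authors' implementation. It uses the standard Levinson-Durbin recursion on the frame autocorrelation. The function names `parcor` and `sliding_parcor`, the prediction order of 10, the 4 ms hop, the Hamming window, and the sign convention of the coefficients are all assumptions for illustration; only the 16 ms window length comes from the abstract, and the 16 kHz default matches the TIMIT sampling rate.

```python
import numpy as np


def parcor(frame, order):
    """Reflection (PARCOR) coefficients of one frame via Levinson-Durbin.

    Sign convention (an assumption; references differ): the predictor
    polynomial is A(z) = 1 + sum_k a[k] z^-k and k_i = a_i at order i.
    Returns (k, err) where err is the final prediction-error energy.
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Biased autocorrelation for lags 0..order
    r = np.array([np.dot(frame[:n - lag], frame[lag:]) for lag in range(order + 1)])

    a = np.zeros(order + 1)
    a[0] = 1.0
    k = np.zeros(order)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 1e-12:  # silent or perfectly predictable frame
            break
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        ki = -acc / err
        k[i - 1] = ki
        prev = a.copy()
        a[1:i] = prev[1:i] + ki * prev[i - 1:0:-1]
        a[i] = ki
        err *= 1.0 - ki * ki
    return k, err


def sliding_parcor(signal, fs=16000, win_ms=16.0, hop_ms=4.0, order=10):
    """PARCOR feature vectors for a sliding window (hop and order are assumed)."""
    signal = np.asarray(signal, dtype=float)
    win = int(round(fs * win_ms / 1000.0))
    hop = int(round(fs * hop_ms / 1000.0))
    taper = np.hamming(win)  # assumed window shape
    feats = [
        parcor(signal[s:s + win] * taper, order)[0]
        for s in range(0, len(signal) - win + 1, hop)
    ]
    return np.array(feats)
```

Each row returned by `sliding_parcor` is one feature vector that could be passed to a supervised classifier such as the maximum likelihood rule named in the abstract. Because the recursion runs on the biased autocorrelation, the coefficients stay within the interval [-1, 1], which is one reason lattice parameters are convenient as features.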