• DocumentCode
    1242823
  • Title

    Isolated Mandarin syllable recognition using segmental features

  • Author

    Chang, S. ; Chen, S.-H.

  • Author_Institution
    Dept. of Commun. Eng., Nat. Chiao Tung Univ., Hsinchu, Taiwan
  • Volume
    142
  • Issue
    1
  • fYear
    1995
  • fDate
    2/1/1995
  • Firstpage
    59
  • Lastpage
    64
  • Abstract
    A segment-based speech recognition scheme is proposed. The basic idea is to explicitly model the correlation among successive frames of the speech signal by using features that represent the contours of spectral parameters. The speech signal of an utterance is regarded as a template formed by directly concatenating a sequence of acoustic segments. Each constituent acoustic segment is inherently of variable length and is represented by a fixed-dimensional feature vector formed from the coefficients of discrete orthonormal polynomial expansions that approximate its spectral parameter contours. In training, an automatic algorithm is proposed to generate several segment-based reference templates for each syllable class. In testing, a frame-based dynamic programming procedure is employed to calculate the matching score between the test utterance and each reference template. The performance of the proposed scheme was examined by simulations of multi-speaker speech recognition on 408 highly confusable isolated Mandarin base-syllables. A recognition rate of 81.1% was achieved using 5-segment, 8-reference-template models with cepstral and delta-cepstral coefficients as the recognition features. This is 4.5% higher than that of a well-modelled 12-state, 5-mixture CHMM method using cepstral, delta-cepstral, and delta-delta-cepstral coefficients.
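    The abstract's central step, mapping a variable-length acoustic segment to a fixed-dimensional vector of polynomial expansion coefficients, can be illustrated with a minimal sketch. This is not the authors' implementation: the polynomial order, the way the orthonormal basis is built (QR decomposition of a Vandermonde matrix), and the function name segment_features are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def segment_features(frames, order=2):
    """Project each spectral-parameter contour of a segment onto a
    discrete orthonormal polynomial basis (illustrative sketch only).

    frames: array of shape (n_frames, n_params), e.g. cepstral vectors.
    order:  highest polynomial degree (assumed value, not from the paper).
    Returns (feature_vector, basis); the feature vector has length
    (order + 1) * n_params regardless of n_frames.
    """
    frames = np.asarray(frames, dtype=float)
    n_frames = frames.shape[0]
    if n_frames < order + 1:
        raise ValueError("segment needs at least order + 1 frames")

    # Normalised time axis so segments of any length share one domain.
    t = np.linspace(-1.0, 1.0, n_frames)

    # Columns 1, t, t^2, ... sampled at the frame positions.
    vander = np.vander(t, order + 1, increasing=True)

    # QR orthonormalises the columns, giving polynomials of degree
    # 0..order that are orthonormal over the discrete frame grid.
    basis, _ = np.linalg.qr(vander)

    # Expansion coefficients of every parameter contour.
    coeffs = basis.T @ frames          # shape (order + 1, n_params)
    return coeffs.ravel(), basis
```

    With this sketch, a 7-frame segment and a 13-frame segment of 12 cepstral coefficients both yield a 36-dimensional vector at order 2, which is the property that lets segments of different durations be compared within a template.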
  • Keywords
    cepstral analysis; dynamic programming; feature extraction; natural languages; speech recognition; acoustic segments; automatic algorithm; cepstral coefficients; coefficients; correlation; delta-cepstral coefficients; discrete orthonormal polynomial expansions; feature vector; isolated Mandarin syllable recognition; matching score; multi-speaker speech recognition; performance; recognition rate; reference template; segmental features; simulations; spectral parameter contours; speech signal; test utterance; testing; training
  • fLanguage
    English
  • Journal_Title
    IEE Proceedings - Vision, Image and Signal Processing
  • Publisher
    IET
  • ISSN
    1350-245X
  • Type

    jour

  • DOI
    10.1049/ip-vis:19951648
  • Filename
    363600