  • DocumentCode
    2180494
  • Title
    Performance of connected digit recognizers with context-dependent word duration modeling
  • Author
    Kwon, Oh Wook ; Un, Chong Kwan

  • Author_Institution
    Spoken Language Processing Sect., ETRI, Taejon, South Korea
  • fYear
    1996
  • fDate
    18-21 Nov 1996
  • Firstpage
    243
  • Lastpage
    246
  • Abstract
    In a Korean connected digit recognizer, insertion and deletion errors account for about half of the total recognition errors because the Korean language has two monophonemic digits. Previous studies showed that these errors are not corrected even by discriminative training algorithms. To reduce these errors, we propose to model context-dependent word duration information and incorporate it directly into the decoding algorithm. Experimental results show that, while incorporating duration information in the postprocessing stage does not achieve significant improvements over a baseline system, the proposed method reduces word error rates by as much as 10% for unknown-length decoding when the recognizer is trained by the maximum likelihood estimation and generalized probabilistic descent methods. Furthermore, simple duration modeling by a bounded uniform distribution achieves performance improvements comparable to detailed duration modeling by a gamma or Gaussian distribution, and hence it is a good compromise between performance and complexity.
  • Keywords
    Gaussian distribution; decoding; errors; gamma distribution; maximum likelihood estimation; probability; speech coding; speech recognition; Korean language; bounded uniform distribution; connected digit recognizers; context-dependent word duration modeling; decoding algorithm; deletion errors; duration information; generalized probabilistic descent method; insertion errors; monophonemic digits; postprocessing stage; recognition errors; word error rates; Context modeling; Error analysis; Error correction; Hidden Markov models; Maximum likelihood decoding; Natural languages; Pattern recognition; Probability distribution
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Circuits and Systems, 1996., IEEE Asia Pacific Conference on
  • Conference_Location
    Seoul
  • Print_ISBN
    0-7803-3702-6
  • Type
    conf
  • DOI
    10.1109/APCAS.1996.569264
  • Filename
    569264