  • DocumentCode
    2837933
  • Title
    A comparative study on various confidence measures in large vocabulary speech recognition
  • Author
    Guo, Gang ; Huang, Chao ; Jiang, Hui ; Wang, Ren-Hua
  • Author_Institution
    Univ. of Sci. & Technol. of China, Hefei, China
  • fYear
    2004
  • fDate
    15-18 Dec. 2004
  • Firstpage
    9
  • Lastpage
    12
  • Abstract
    In this paper, we present a comparative study of several confidence measures (CMs) for large vocabulary speech recognition. First, we propose a novel high-level CM based on inter-word mutual information (MI). Second, we experimentally investigate several popular low-level CMs, including word posterior probabilities, N-best counting, and likelihood ratio testing (LRT). Finally, we study a simple linear interpolation strategy that combines the best low-level CM with the best high-level CM. All of these CMs are evaluated on two large vocabulary ASR tasks, the Switchboard task and a Mandarin dictation task, where they are used to identify recognition errors made by the baseline recognition systems. Experimental results show that: (1) the proposed MI-based CM substantially outperforms an existing high-level CM based on the LSA technique; (2) among all low-level CMs, word posterior probabilities give the best verification performance; (3) combining the word posterior probabilities with the MI-based CM reduces the equal error rate from 24.4% to 23.9% on the Switchboard task and from 17.5% to 16.2% on the Mandarin dictation task.
  • Keywords
    error statistics; interpolation; maximum likelihood estimation; speech recognition; vocabulary; ASR; LSA technique; Mandarin dictation task; N-best counting; Switchboard task; baseline recognition systems; equal error rate; high-level confidence measures; inter-word mutual information; large vocabulary speech recognition; likelihood ratio testing; linear interpolation; recognition errors; verification performance; word posterior probabilities; Acoustic measurements; Asia; Automatic speech recognition; Chaos; Collision mitigation; Humans; Light rail systems; Probability; Speech recognition; Vocabulary;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Chinese Spoken Language Processing, 2004 International Symposium on
  • Print_ISBN
    0-7803-8678-7
  • Type
    conf
  • DOI
    10.1109/CHINSL.2004.1409573
  • Filename
    1409573
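
The abstract describes combining a low-level CM with a high-level CM by simple linear interpolation and comparing systems by equal error rate, but it gives neither the interpolation formula nor the weight. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names `interpolate_cm` and `equal_error_rate`, the `weight` parameter, the accept-if-score-at-or-above-threshold rule, and the toy scores are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def interpolate_cm(low_level: Sequence[float],
                   high_level: Sequence[float],
                   weight: float) -> List[float]:
    """Linearly interpolate two per-word confidence scores.

    `weight` (hypothetical) is placed on the low-level score and
    `1 - weight` on the high-level score.
    """
    if len(low_level) != len(high_level):
        raise ValueError("score lists must have the same length")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [weight * a + (1.0 - weight) * b
            for a, b in zip(low_level, high_level)]


def equal_error_rate(scores: Sequence[float],
                     is_correct: Sequence[bool]) -> float:
    """Approximate the equal error rate by sweeping thresholds.

    Assumption: a word is accepted when its score >= threshold.
    False acceptance = a misrecognized word is accepted.
    False rejection  = a correctly recognized word is rejected.
    Returns the mean of the two rates at the threshold where they
    are closest to each other.
    """
    if len(scores) != len(is_correct):
        raise ValueError("scores and labels must have the same length")
    n_correct = sum(1 for c in is_correct if c)
    n_wrong = len(is_correct) - n_correct
    if n_correct == 0 or n_wrong == 0:
        raise ValueError("need at least one correct and one incorrect word")

    # Every observed score is a candidate threshold; +inf rejects everything.
    thresholds = sorted(set(scores)) + [float("inf")]
    best_gap, best_eer = float("inf"), 1.0
    for t in thresholds:
        false_accepts = sum(1 for s, c in zip(scores, is_correct)
                            if s >= t and not c)
        false_rejects = sum(1 for s, c in zip(scores, is_correct)
                            if s < t and c)
        far = false_accepts / n_wrong
        frr = false_rejects / n_correct
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2.0
    return best_eer


if __name__ == "__main__":
    # Toy, made-up scores for four recognized words (not data from the paper).
    posterior = [0.9, 0.8, 0.4, 0.3]       # stand-in for a low-level CM
    mi_based = [0.7, 0.2, 0.6, 0.1]        # stand-in for a high-level CM
    labels = [True, True, False, False]    # True = word recognized correctly

    combined = interpolate_cm(posterior, mi_based, weight=0.5)
    print("combined scores:", combined)
    print("EER of combined CM:", equal_error_rate(combined, labels))
```

In practice the interpolation weight would be tuned on held-out data; the value 0.5 above is arbitrary.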