• DocumentCode
    290003
  • Title
    Discriminative training of high performance speech recognizer using N best candidates
  • Author
    Chen, Jung-kuei ; Soong, Frank K.
  • Author_Institution
    Telecommun. Lab., Minist. of Commun., Chung-Li, Taiwan
  • Volume
    i
  • fYear
    1994
  • fDate
    19-22 Apr 1994
  • Abstract
    Proposes an N-best-candidates-based discriminative training procedure for constructing high-performance HMM speech recognizers. The algorithm has two features: (1) a new frame-level loss function; (2) the use of N-best candidates for training. The new frame-level loss function, defined as a rectified log-likelihood difference between the correct hypothesis and the competing hypotheses, is minimized over all training utterances. Two speech recognition applications have been tested: speaker-independent, small-vocabulary (10 Mandarin Chinese digits) continuous speech recognition, and speaker-trained, large-vocabulary (5,000 commonly used Chinese words) isolated word recognition. Significant performance improvement over traditional maximum-likelihood-trained HMMs has been obtained. In the connected Chinese digit recognition experiment, the string error rate is reduced from 17% to 10.8% for unknown-length decoding and from 8.2% to 5.2% for known-length decoding. In the large-vocabulary isolated word recognition experiment, the recognition error rate is reduced from 6.8% to 3.8%.
  • Keywords
    decoding; hidden Markov models; speech recognition; Mandarin Chinese; N best candidates; connected Chinese digit recognition experiment; continuous speech recognition; discriminative training procedure; frame-level loss function; high performance HMM speech recognizers; high performance speech recognizer; isolated word recognition; recognition error rate; rectified log likelihood difference; string error rate; training utterance; Error analysis; Maximum likelihood decoding; Maximum likelihood estimation; Probability density function; Vector quantization; Vocabulary
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Acoustics, Speech, and Signal Processing, 1994. ICASSP-94., 1994 IEEE International Conference on
  • Conference_Location
    Adelaide, SA
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-1775-0
  • Type
    conf
  • DOI
    10.1109/ICASSP.1994.389216
  • Filename
    389216