• DocumentCode
    1749605
  • Title
    Error corrective mechanisms for speech recognition
  • Author
    Mangu, Lidia ; Padmanabhan, Mukund
  • Author_Institution
    IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
  • Volume
    1
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    29
  • Abstract
    In the standard MAP approach to speech recognition, the goal is to find the word sequence with the highest posterior probability given the acoustic observation. A number of alternative approaches have been proposed for directly optimizing the word error rate, the most commonly used evaluation criterion. One of them, the consensus decoding approach, converts a word lattice into a confusion network, which specifies the word-level confusions at different time intervals, and outputs the word with the highest posterior probability from each word confusion set. The paper presents a method for discriminating between the correct and alternate hypotheses in a confusion set using additional knowledge sources extracted from the confusion networks. We use transformation-based learning to induce a set of rules that guide a better decision between the two candidates with the highest posterior probabilities in each confusion set. The choice of this learning method is motivated by the perspicuous representation of the induced rules, which can provide insight into the causes of a speech recognizer's errors. In experiments on the Switchboard corpus, we show significant improvements over the consensus decoding approach.
  • Keywords
    error correction; graph theory; hidden Markov models; learning (artificial intelligence); natural languages; probability; set theory; speech recognition; Switchboard corpus; confusion network; consensus decoding approach; error corrective mechanisms; knowledge sources; posterior probability; speech recognizer; transformation-based learning; word confusion set; word error rate; word lattice; word-level confusions; Decision making; Decoding; Error analysis; Error correction; Lattices; Learning systems; Speech recognition; Viterbi algorithm
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '01). 2001 IEEE International Conference on
  • Conference_Location
    Salt Lake City, UT
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-7041-4
  • Type
    conf
  • DOI
    10.1109/ICASSP.2001.940759
  • Filename
    940759
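
The decoding step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the confusion network is assumed to be a list of word-to-posterior dictionaries, the "<eps>" deletion token and the example posteriors are made up for illustration, and the single rule shown is hand-written and hypothetical, whereas the paper induces its rules with transformation-based learning.

```python
from typing import Callable, Dict, List

# Assumed representation: one dict per confusion set, word -> posterior.
ConfusionSet = Dict[str, float]
# Hypothetical rule type: returns True when the second-ranked candidate
# should replace the top-ranked one.
Rule = Callable[[ConfusionSet, str, str], bool]


def consensus_decode(network: List[ConfusionSet]) -> List[str]:
    """Output the highest-posterior word from each confusion set."""
    return [max(cs, key=cs.get) for cs in network]


def top_two(cs: ConfusionSet) -> List[str]:
    """Return up to two candidates, ordered by descending posterior."""
    return sorted(cs, key=cs.get, reverse=True)[:2]


def decode_with_rules(network: List[ConfusionSet], rules: List[Rule]) -> List[str]:
    """Choose between the top two candidates; the first matching rule wins.

    This is a simplification: it does not reproduce how learned
    transformation rules are ordered or applied in the paper.
    """
    output = []
    for cs in network:
        candidates = top_two(cs)
        choice = candidates[0]
        if len(candidates) == 2:
            first, second = candidates
            if any(rule(cs, first, second) for rule in rules):
                choice = second
        output.append(choice)
    return output


def prefer_word_over_deletion(cs: ConfusionSet, first: str, second: str) -> bool:
    """Hypothetical rule: prefer a real word over a narrowly winning deletion."""
    return first == "<eps>" and cs[second] >= 0.4


# Made-up example network with three confusion sets.
network = [
    {"i": 0.9, "eye": 0.1},
    {"<eps>": 0.5, "have": 0.45, "of": 0.05},
    {"it": 0.6, "is": 0.4},
]

baseline = consensus_decode(network)
corrected = decode_with_rules(network, [prefer_word_over_deletion])
```

Here `baseline` keeps the deletion in the second slot, while `corrected` selects "have" because the hypothetical rule fires for that confusion set.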