• DocumentCode
    3529748
  • Title
    Discriminative pronounciation learning using phonetic decoder and minimum-classification-error criterion
  • Author
    Vinyals, Oriol ; Deng, Li ; Yu, Dong ; Acero, Alex
  • Author_Institution
    Int. Comput. Sci. Inst., Berkeley, CA
  • fYear
    2009
  • fDate
    19-24 April 2009
  • Firstpage
    4445
  • Lastpage
    4448
  • Abstract
    In this paper, we report our recent research aimed at improving the pronunciation-modeling component of a speech recognition system designed for mobile voice search. Our new discriminative learning technique overcomes the limitation of the traditional ways of introducing alternative pronunciations that often enlarge confusability across different lexical items. Instead, we make use of a phonetic recognizer to generate pronunciation candidates, which are then evaluated and selected using the global minimum-classification-error measure, guaranteeing a reduction of the training-set error rate after introducing alternative pronunciations. A maximum entropy approach is subsequently used to learn the weight parameters of the selected pronunciation candidates. Our experimental results demonstrate the effectiveness of the discriminative pronunciation learning technique in a real-world speech recognition task where pronunciation of business names presents special difficulty for high-accuracy speech recognition.
  • Keywords
    grammars; maximum entropy methods; mobile communication; signal classification; speech coding; speech recognition; voice communication; business name; discriminative pronunciation learning; maximum entropy; minimum-classification-error criterion; mobile voice search; phonetic decoder; phonetic recognizer; pronunciation modeling; speech recognition system; Acoustic waves; Automatic speech recognition; Computer science; Dictionaries; Entropy; Error analysis; Lattices; Learning systems; Maximum likelihood decoding; Speech recognition; MCE objective function; Pronunciation modeling; discriminative learning; greedy search; phonetic decoding;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
  • Conference_Location
    Taipei
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-2353-8
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2009.4960616
  • Filename
    4960616