• DocumentCode
    353507
  • Title
    Heterogeneous lexical units for automatic speech recognition: preliminary investigations
  • Author
    Bazzi, Issam; Glass, James
  • Author_Institution
    Lab. for Comput. Sci., MIT, Cambridge, MA, USA
  • Volume
    3
  • fYear
    2000
  • fDate
    2000
  • Firstpage
    1257
  • Abstract
    This paper explores the use of the phone and syllable as primary units of representation in the first stage of a two-stage recognizer. A finite-state transducer speech recognizer is utilized to configure the recognition as a two-stage process, where either phone or syllable graphs are computed in the first stage and passed to the second stage to determine the most likely word hypotheses. Preliminary experiments in a weather information speech understanding domain show that a syllable representation with either bigram or trigram language models provides more constraint than a phonetic representation with a higher-order n-gram language model (up to a 6-gram), and approaches the performance of a more conventional single-stage word-based configuration.
  • Keywords
    graph theory; speech recognition; automatic speech recognition; bigram language model; finite-state transducer speech recognizer; graphs; heterogeneous lexical units; performance; phone; representation; syllable; trigram language model; two-stage process; two-stage recognizer; weather information speech understanding domain; word hypotheses; Automatic speech recognition; Computer science; Glass; Information systems; Laboratories; Natural languages; Speech processing; Speech recognition; Transducers; Vocabulary
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2000. ICASSP '00. Proceedings. 2000 IEEE International Conference on
  • Conference_Location
    Istanbul
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-6293-4
  • Type
    conf
  • DOI
    10.1109/ICASSP.2000.861804
  • Filename
    861804