• DocumentCode
    3485423
  • Title

    The IBM 2011 GALE Arabic speech transcription system

  • Author

    Mangu, Lidia ; Kuo, Hong-Kwang ; Chu, Stephen ; Kingsbury, Brian ; Saon, George ; Soltau, Hagen ; Biadsy, Fadi

  • Author_Institution
    IBM T. J. Watson Res. Center, Yorktown Heights, NY, USA
  • fYear
    2011
  • fDate
    11-15 Dec. 2011
  • Firstpage
    272
  • Lastpage
    277
  • Abstract
    We describe the Arabic broadcast transcription system fielded by IBM in the GALE Phase 5 machine translation evaluation. Key advances over our Phase 4 system include a new Bayesian Sensing HMM acoustic model; multistream neural network features; a MADA vowelized acoustic model; and the use of a variety of language model techniques with significant additive gains. These advances were instrumental in achieving a word error rate of 7.4% on the Phase 5 evaluation set, and an absolute improvement of 0.9% word error rate over our 2009 system on the unsequestered Phase 4 evaluation data.
  • Keywords
    Bayes methods; hidden Markov models; neural nets; speech processing; Bayesian sensing HMM acoustic model; GALE Phase 5 machine translation evaluation; IBM 2011 GALE Arabic speech transcription system; MADA vowelized acoustic model; language model techniques; multistream neural network features; phase 4 system; unsequestered phase 4 evaluation data; Acoustics; Computational modeling; Dictionaries; Hidden Markov models; Lattices; Training; Transforms; large vocabulary speech recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition and Understanding (ASRU), 2011 IEEE Workshop on
  • Conference_Location
    Waikoloa, HI
  • Print_ISBN
    978-1-4673-0365-1
  • Electronic_ISBN
    978-1-4673-0366-8
  • Type

    conf

  • DOI
    10.1109/ASRU.2011.6163943
  • Filename
    6163943