• DocumentCode
    2180554
  • Title
    Generating compound words with high order n-gram information in large vocabulary speech recognition systems
  • Author
    Jie Zhou ; Shi, Qin ; Qin, Yang
  • Author_Institution
    IBM Res. - China, Beijing, China
  • fYear
    2011
  • fDate
    22-27 May 2011
  • Firstpage
    5560
  • Lastpage
    5563
  • Abstract
    This work concentrates on generating compound words with high-order n-gram information for speech recognition. Most existing compound-word generation methods consider only bi-gram information. They succeed in improving the performance of bi-gram models but do not work well in higher-order n-gram cases. Since 3-gram and 4-gram language models are now commonly used, we present a high-order n-gram based computation, called the gradient criterion, that generates compound words automatically in an exact way. We test this method on the Mandarin Open Voice Search (OVS) task and obtain a 0.62% absolute improvement over the 16.44% baseline. This result also outperforms the traditional mutual-information-based methods. Furthermore, we test the history effect and the prediction effect of this criterion and find that the history effect plays a more important role in the decoding task.
  • Keywords
    speech recognition; 3-gram language models; 4-gram language models; OVS; high order N-gram information; large vocabulary speech recognition systems; open voice search; compound words; gradient criterion; high order; vocabulary
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
  • Conference_Location
    Prague
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4577-0538-0
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2011.5947619
  • Filename
    5947619