• DocumentCode
    25290
  • Title

    Latent Semantic Rational Kernels for Topic Spotting on Conversational Speech

  • Author

    Weng, Chao; Thomson, David L.; Haffner, Patrick; Juang, Biing-Hwang Fred

  • Author_Institution
    Department of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, USA
  • Volume
    22
  • Issue
    12
  • fYear
    2014
  • fDate
    Dec. 2014
  • Firstpage
    1738
  • Lastpage
    1749
  • Abstract
    In this work, we propose latent semantic rational kernels (LSRK) for topic spotting on conversational speech. Rather than mapping the input weighted finite-state transducers (WFSTs) onto a high-dimensional n-gram feature space, as n-gram rational kernels do, the proposed LSRK maps the WFSTs onto a latent semantic space. With the proposed LSRK, all available external knowledge and techniques can be flexibly integrated into a unified WFST-based framework to boost topic spotting performance. We show how to generalize the LSRK using tf-idf weighting, latent semantic analysis, WordNet, and probabilistic topic models. To validate the proposed LSRK framework, we conduct topic spotting experiments on two datasets, Switchboard and the AT&T HMIHY0300 initial collection. The experimental results show that the proposed LSRK achieves significant and consistent topic spotting performance gains over n-gram rational kernels.
  • Keywords
    information analysis; probability; speech recognition; AT&T HMIHY0300 initial collection; LSRK; WFST; WordNet; conversational speech; high-dimensional n-gram feature space; input weighted finite-state transducers; latent semantic analysis; latent semantic rational kernels; latent semantic space; n-gram rational kernels; probabilistic topic models; Switchboard; topic spotting; Kernel; Probabilistic logic; Semantics; Speech; Speech processing; Transducers; Vectors; LDA; LSA; PLSA; WFSTs; rational kernels; tf-idf
  • fLanguage
    English
  • Journal_Title
    IEEE/ACM Transactions on Audio, Speech, and Language Processing
  • Publisher
    IEEE
  • ISSN
    2329-9290
  • Type

    jour

  • DOI
    10.1109/TASLP.2014.2347133
  • Filename
    6877669