• DocumentCode
    3585065
  • Title
    EM-based phoneme confusion matrix generation for low-resource spoken term detection
  • Author
    Di Xu; Yun Wang; Florian Metze
  • Author_Institution
    Sch. of Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, USA
  • fYear
    2014
  • Firstpage
    424
  • Lastpage
    429
  • Abstract
    The idea of using a data-driven phoneme confusion matrix (PCM) to enhance speech recognition and retrieval performance is not new to the speech community. Although empirical results show varying degrees of improvement from introducing a PCM, the underlying data-driven processes described in most papers are rather ad hoc and lack rigorous statistical justification. In this paper we focus on the statistical aspects of PCM generation, and propose and justify a novel expectation-maximization based algorithm for data-driven PCM generation. We evaluate the generated PCMs in the context of low-resource spoken term detection, with a primary focus on out-of-vocabulary keywords.
  • Keywords
    expectation-maximisation algorithm; information retrieval; matrix algebra; speech recognition; statistical analysis; EM-based phoneme confusion matrix generation; data-driven PCM generation; data-driven phoneme confusion matrix; data-driven processes; expectation-maximization based algorithm; low-resource spoken term detection; out-of-vocabulary keywords; speech community; speech retrieval performance; statistical aspects; Estimation; Optimization; Phase change materials; Probabilistic logic; Speech; Speech recognition; Viterbi algorithm; Expectation-maximization algorithm; machine learning; out-of-vocabulary words; spoken term detection
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Spoken Language Technology Workshop (SLT), 2014 IEEE
  • Type
    conf
  • DOI
    10.1109/SLT.2014.7078612
  • Filename
    7078612