• DocumentCode
    180172
  • Title
    I-vector based language modeling for spoken document retrieval
  • Author
    Kuan-Yu Chen ; Hung-Shin Lee ; Hsin-Min Wang ; Chen, Bing ; Hsin-Hsi Chen
  • Author_Institution
    Inst. of Inf. Sci., Taipei, Taiwan
  • fYear
    2014
  • fDate
    4-9 May 2014
  • Firstpage
    7083
  • Lastpage
    7088
  • Abstract
    As more and more multimedia data associated with spoken documents have become publicly available, spoken document retrieval (SDR) has become an important research subject over the past two decades. The i-vector framework was recently proposed for language identification (LID) and speaker recognition (SR) tasks. Its major contribution is to reduce the series of acoustic feature vectors of a speech utterance to a low-dimensional vector representation, to which a number of well-developed postprocessing techniques (such as probabilistic linear discriminant analysis, PLDA) can then be readily and effectively applied. However, to the best of our knowledge, no research to date has applied the i-vector framework to SDR or information retrieval (IR). In this paper, we take a step forward and formulate an i-vector based language modeling (IVLM) framework for SDR. Furthermore, we evaluate the proposed IVLM framework with both inductive and transductive learning strategies. We also exploit multiple levels of index features, including word- and subword-level units, in concert with the proposed framework. The results of SDR experiments conducted on the TDT-2 (Topic Detection and Tracking) collection demonstrate the performance merits of the proposed framework in comparison with several existing approaches.
  • Keywords
    information retrieval; learning by example; probability; speaker recognition; IVLM framework; LID; PLDA; SDR; TDT-2 collection; acoustic feature vector; i-vector based framework; i-vector based language modeling framework; inductive learning strategy; information retrieval; language identification; low-dimensional vector representation; multimedia data; postprocessing techniques; probabilistic linear discriminative analysis; speaker recognition task; speech utterance; spoken document retrieval; subword-level unit; topic detection and tracking collection; transductive learning strategy; Context; Indexes; Information retrieval; Probabilistic logic; Semantics; Training; Vectors; Spoken document retrieval; i-vector; inductive; language modeling; transductive;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
  • Conference_Location
    Florence
  • Type
    conf
  • DOI
    10.1109/ICASSP.2014.6854974
  • Filename
    6854974