• DocumentCode
    52154
  • Title

    Improved Semantic Retrieval of Spoken Content by Document/Query Expansion with Random Walk Over Acoustic Similarity Graphs

  • Author

    Hung-yi Lee ; Lin-Shan Lee

  • Author_Institution
    Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan
  • Volume
    22
  • Issue
    1
  • fYear
    2014
  • fDate
    Jan. 2014
  • Firstpage
    80
  • Lastpage
    94
  • Abstract
    In a text context, document/query expansion has proven very useful in retrieving objects semantically related to the query. However, when applying text-based techniques to spoken content, the inevitable recognition errors seriously degrade performance even when the retrieval process is performed over lattices. We propose the estimation of more accurate term distributions (or unigram language models) for the spoken documents by acoustic similarity graphs. In this approach, a graph is constructed for each term describing the acoustic similarity among all signal regions hypothesized to be the considered term. Score propagation based on a random walk over the graph offers more reliable scores of the term hypotheses, which in turn yield more accurate term distributions (or unigram language models). This approach was applied with the language modeling retrieval approach, including using document expansion based on latent topic analysis and query expansion with a query-regularized mixture model. We extend these approaches from words to subword n-grams, and the query expansion from document-level to utterance-level and from term-based to topic-based. Experiments performed on Mandarin broadcast news showed improved performance under almost all tested conditions.
  • Keywords
    content-based retrieval; graph theory; natural language processing; query processing; Mandarin broadcast news; acoustic similarity graphs; document expansion; document-level query expansion; improved semantic spoken content retrieval; language modeling retrieval approach; latent topic analysis; object retrieval; query expansion; query-regularized mixture model; random walk; recognition errors; score propagation; signal regions; subword n-grams; term distributions; term-based query expansion; text-based techniques; topic-based query expansion; unigram language models; utterance-level query expansion; Acoustics; Analytical models; Estimation; Information retrieval; Lattices; Materials; Semantics; Document expansion; latent semantic analysis; query expansion; random walk; spoken content retrieval;
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    2329-9290
  • Type

    jour

  • DOI
    10.1109/TASLP.2013.2285469
  • Filename
    6633083
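
The abstract describes score propagation by a random walk over a per-term acoustic similarity graph. The sketch below is only an illustration of that general idea, not the authors' formulation: the function name `propagate_scores`, the restart weight `alpha`, the interpolation with the initial scores, and the renormalization step are all assumptions introduced here. Each node stands for one signal region hypothesized to be the term, and edge weights stand for acoustic similarity between regions.

```python
def propagate_scores(initial, similarity, alpha=0.9, iters=200, tol=1e-9):
    """Random walk with restart over a similarity graph (illustrative sketch).

    initial:    non-negative initial scores, one per hypothesized region
                (assumed to come from the recognizer, e.g. lattice posteriors).
    similarity: square list of lists; similarity[i][j] >= 0 is the assumed
                acoustic similarity between regions i and j.
    alpha:      assumed weight on propagated scores, with 0 <= alpha < 1.
    """
    if not 0 <= alpha < 1:
        raise ValueError("alpha must satisfy 0 <= alpha < 1")
    n = len(initial)
    total = sum(initial)
    if n == 0 or total <= 0:
        return list(initial)

    # Restart distribution: initial scores normalized to sum to 1.
    restart = [s / total for s in initial]

    # Total outgoing similarity of each node, ignoring self-loops.
    out_sum = [sum(similarity[j][i] for i in range(n) if i != j)
               for j in range(n)]

    scores = restart[:]
    for _ in range(iters):
        new = []
        for i in range(n):
            # Score flowing into node i from its neighbours, in proportion
            # to each neighbour's normalized similarity to i.
            incoming = 0.0
            for j in range(n):
                if j != i and out_sum[j] > 0:
                    incoming += scores[j] * similarity[j][i] / out_sum[j]
            new.append((1 - alpha) * restart[i] + alpha * incoming)
        # Isolated nodes pass on no score, so renormalize to keep sum 1.
        z = sum(new)
        new = [v / z for v in new]
        converged = max(abs(a - b) for a, b in zip(new, scores)) < tol
        scores = new
        if converged:
            break
    return scores


# Three hypothesized regions with equal initial scores: regions 0 and 1
# are acoustically similar to each other, region 2 resembles neither.
sim = [[0.0, 0.9, 0.1],
       [0.9, 0.0, 0.1],
       [0.1, 0.1, 0.0]]
scores = propagate_scores([0.5, 0.5, 0.5], sim)
```

In this toy graph the two mutually similar regions reinforce each other and end up with higher scores than the outlier, which is the intuition the abstract gives for obtaining more reliable term hypothesis scores. How the propagated scores are turned into term distributions is not covered by this sketch.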