• DocumentCode
    1320084
  • Title
    Interactive Spoken Document Retrieval With Suggested Key Terms Ranked by a Markov Decision Process
  • Author
    Pan, Yi-Cheng; Lee, Hung-yi; Lee, Lin-shan
  • Author_Institution
    MediaTek, Inc., Hsinchu, Taiwan
  • Volume
    20
  • Issue
    2
  • fYear
    2012
  • Firstpage
    632
  • Lastpage
    645
  • Abstract
    Interaction with users is a powerful strategy that potentially yields better information retrieval for all types of media, including text, images, and videos. While spoken document retrieval (SDR) is a crucial technology for multimedia access in the network era, it is also more challenging than text information retrieval because of the inevitable recognition errors. It is therefore reasonable to consider interactive functionalities for SDR systems. We propose an interactive SDR approach in which, given the user's query, the system returns not only the retrieval results but also a short list of key terms describing distinct topics. The user selects these key terms to expand the query if the retrieval results are not satisfactory. The entire retrieval process is organized around a hierarchy of key terms that define the allowable state transitions; this is modeled by a Markov decision process, which is popularly used in spoken dialogue systems. By reinforcement learning with simulated users, the key terms on the short list are properly ranked such that the retrieval success rate is maximized while the number of interactive steps is minimized. Significant improvements over existing approaches were observed in preliminary experiments performed on information needs provided by real users. A prototype system was also implemented.
  • Keywords
    Markov processes; information needs; information retrieval; interactive systems; learning (artificial intelligence); multimedia computing; text analysis; Markov decision process; SDR system; inevitable recognition error; interactive functionality; interactive spoken document retrieval success rate; multimedia access; reinforcement learning; spoken dialogue system; text information retrieval process; user query; Economics; Navigation; Prototypes; Speech; Videos; Spoken document retrieval (SDR); dialogue system
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type
    jour
  • DOI
    10.1109/TASL.2011.2163512
  • Filename
    6018284