• DocumentCode
    591880
  • Title
    Reinforcement learning for spoken dialogue systems using off-policy natural gradient method
  • Author
    Jurcicek, F.
  • Author_Institution
    Fac. of Math. & Phys., Charles Univ. in Prague, Prague, Czech Republic
  • fYear
    2012
  • fDate
    2-5 Dec. 2012
  • Firstpage
    7
  • Lastpage
    12
  • Abstract
    Reinforcement learning methods have been successfully used to optimise dialogue strategies in statistical dialogue systems. Typically, reinforcement learning techniques learn on-policy, i.e., the dialogue strategy is updated online while the system is interacting with a user. An alternative to this approach is off-policy reinforcement learning, which estimates an optimal dialogue strategy offline from a fixed corpus of previously collected dialogues. This paper proposes a novel off-policy reinforcement learning method based on natural policy gradients and importance sampling. The algorithm is evaluated on a spoken dialogue system in the tourist information domain. The experiments indicate that the proposed method learns a dialogue strategy that significantly outperforms the baseline handcrafted dialogue policy.
  • Keywords
    gradient methods; importance sampling; interactive systems; learning (artificial intelligence); optimisation; speech-based user interfaces; travel industry; dialogue strategy optimisation; importance sampling; off-policy natural gradient method; off-policy reinforcement learning method; optimal dialogue strategy; spoken dialogue systems; statistical dialogue systems; tourist information domain; Gradient methods; History; Learning; Linear approximation; Stochastic processes; Training; POMDP; dialogue management; off-policy reinforcement learning; policy gradient methods;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Spoken Language Technology Workshop (SLT), 2012 IEEE
  • Conference_Location
    Miami, FL
  • Print_ISBN
    978-1-4673-5125-6
  • Electronic_ISBN
    978-1-4673-5124-9
  • Type
    conf
  • DOI
    10.1109/SLT.2012.6424161
  • Filename
    6424161