• DocumentCode
    677856
  • Title

    Neural Combinatorial Learning of Goal-Directed Behavior with Reservoir Critic and Reward Modulated Hebbian Plasticity

  • Author

    Dasgupta, S. ; Worgotter, Florentin ; Morimoto, Jun ; Manoonpong, Poramate

  • Author_Institution
    Bernstein Center for Comput. Neurosci. (BCCN), Georg-August-Univ., Gottingen, Germany
  • fYear
    2013
  • fDate
    13-16 Oct. 2013
  • Firstpage
    993
  • Lastpage
    1000
  • Abstract
    Learning of goal-directed behaviors in biological systems is broadly based on associations between conditional and unconditional stimuli. This can be further classified as classical conditioning (correlation-based learning) and operant conditioning (reward-based learning). Although these are traditionally modeled as separate learning systems in artificial agents, numerous animal experiments point towards their cooperative role in behavioral learning. Based on this concept, the recently introduced framework of neural combinatorial learning combines the two systems, which run in parallel to guide the overall learned behavior. Such combinatorial learning yields a faster and more efficient learner. In this work, we further improve the framework by applying a reservoir computing (RC) network as an adaptive critic unit, together with reward-modulated Hebbian plasticity. Using a mobile robot system for goal-directed behavior learning, we clearly demonstrate that the reservoir critic outperforms traditional radial basis function (RBF) critics in terms of stability of convergence and learning time. Furthermore, the temporal memory in RC allows the system to learn a partially observable Markov decision process scenario, in contrast to a memoryless RBF critic.
  • Keywords
    Hebbian learning; Markov processes; convergence; mobile robots; radial basis function networks; stability; RC; adaptive critic unit; convergence stability; goal-directed behavior learning; learning time stability; memoryless RBF critic; mobile robot system; neural combinatorial learning; partially observable Markov decision process; radial basis function; reservoir computing network; reservoir critic; reward modulated Hebbian plasticity; temporal memory; Green products; Learning systems; Mobile robots; Neurons; Reservoirs; Robot sensing systems; Correlation learning; Reinforcement learning; Reservoir networks; Temporal memory;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Systems, Man, and Cybernetics (SMC), 2013 IEEE International Conference on
  • Conference_Location
    Manchester
  • Type

    conf

  • DOI
    10.1109/SMC.2013.174
  • Filename
    6721927