• DocumentCode
    2717619
  • Title

    Using Reward-weighted Regression for Reinforcement Learning of Task Space Control

  • Author

    Peters, Jan ; Schaal, Stefan

  • Author_Institution
    Univ. of Southern California, Los Angeles, CA
  • fYear
    2007
  • fDate
    1-5 April 2007
  • Firstpage
    262
  • Lastpage
    267
  • Abstract
    Many robot control problems of practical importance, including task or operational space control, can be reformulated as immediate reward reinforcement learning problems. However, few of the known optimization or reinforcement learning algorithms can be used in online learning control for robots, as they are either prohibitively slow, do not scale to interesting domains of complex robots, or require trying out policies generated by random search, which are infeasible for a physical system. Using a generalization of the EM-based reinforcement learning framework suggested by Dayan & Hinton, we reduce the problem of learning with immediate rewards to a reward-weighted regression problem with an adaptive, integrated reward transformation for faster convergence. The resulting algorithm is efficient, learns smoothly without dangerous jumps in solution space, and works well in applications of complex high degree-of-freedom robots.
  • Keywords
    control engineering computing; intelligent control; learning (artificial intelligence); regression analysis; robots; EM-based reinforcement learning; complex high degree-of-freedom robots; immediate reward reinforcement learning; online learning control; operational space control; reward-weighted regression; robot control; task space control; Acceleration; Anthropomorphism; Control systems; Learning; Manipulators; Optimal control; Orbital robotics; Robot control; Robot kinematics; Robot sensing systems
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Approximate Dynamic Programming and Reinforcement Learning, 2007. ADPRL 2007. IEEE International Symposium on
  • Conference_Location
    Honolulu, HI
  • Print_ISBN
    1-4244-0706-0
  • Type

    conf

  • DOI
    10.1109/ADPRL.2007.368197
  • Filename
    4220842