• DocumentCode
    3144485
  • Title

    Task decomposition and dynamic policy merging in the distributed Q-learning classifier system

  • Author

    Chapman, Kevin L. ; Bay, John S.

  • Author_Institution
    Bradley Dept. of Electr. Eng., Virginia Polytech. Inst. & State Univ., Blacksburg, VA, USA
  • fYear
    1997
  • fDate
    10-11 Jul 1997
  • Firstpage
    166
  • Lastpage
    171
  • Abstract
    A distributed reinforcement learning system is designed and implemented on a mobile robot for the study of complex task decomposition and dynamic policy merging in real robot learning environments. The distributed Q-learning classifier system (DQLCS) is evolved from the standard LCS proposed by Holland (1996). We address two of the limitations of the LCS through the use of Q-learning as the apportionment of credit component and a distributed learning architecture to facilitate complex task decomposition. The Q-learning update equation is derived and its advantages over the complex bucket brigade algorithm (BBA) are discussed. Holistic and monolithic shaping approaches are used to distribute reward among the learning modules of the DQLCS and allow dynamic policy merging in a variety of real robot learning experiments.
  • Keywords
    learning (artificial intelligence); minimisation; mobile robots; path planning; apportionment of credit; complex bucket brigade algorithm; distributed Q-learning classifier system; distributed reinforcement learning system; dynamic policy merging; holistic shaping; mobile robot; monolithic shaping; task decomposition; Artificial intelligence; Equations; Intelligent robots; Learning systems; Machine learning; Merging; Mobile robots; Robot sensing systems; State-space methods; Unsupervised learning
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence in Robotics and Automation, 1997. CIRA'97., Proceedings., 1997 IEEE International Symposium on
  • Conference_Location
    Monterey, CA
  • Print_ISBN
    0-8186-8138-1
  • Type

    conf

  • DOI
    10.1109/CIRA.1997.613854
  • Filename
    613854