• DocumentCode
    2368784
  • Title

    The Improvement on Reinforcement Learning for SCM by the Agent Policy Mapping

  • Author

    Sun, Ruoying ; Zhao, Gang ; Li, Chen ; Tatsumi, Shoji

  • Author_Institution
    Beijing Inf. Sci. & Technol. Univ.
  • fYear
    2006
  • fDate
    6-10 Nov. 2006
  • Firstpage
    3585
  • Lastpage
    3590
  • Abstract
    Reinforcement learning (RL) is an efficient and popular approach to problems in which an agent has no a priori knowledge of the environment. It has two defining characteristics: trial-and-error search and delayed rewards. An RL agent must derive an optimal policy by interacting directly with the environment and gathering information about it. Supply chain management (SCM) is a challenging problem for agent-based electronic business. Some proposed RL methods perform better than traditional tools for dynamic problem solving in SCM. RL realizes on-line learning and performs efficiently in some applications, but an RL agent reacts worse than some heuristic methods to sudden changes in SCM demand, because the trial-and-error characteristic of RL is time-consuming in practice. By examining an efficient policy transition mechanism in RL that maps existing policies from a previous task to new policies in a changed task, this paper proposes a novel RL-agent-based SCM system that decreases the time the RL agent needs to learn in a dynamic environment. As a result, the RL agent derives the maximal profit using the RL technique when jobs arrive with a stable distribution. Further, the RL agent makes the optimal procurement that satisfies the requirements of sudden changes in the supply chain network by means of the policy transition mechanism.
  • Keywords
    learning (artificial intelligence); software agents; supply chain management; agent policy mapping; agent-based electronic business; heuristic methods; online learning; optimal procurement; policy transition mechanism; reinforcement learning; supply chain management; supply chain network; trial-and-error characteristics; Consumer electronics; Delay; Information science; Learning; NP-hard problem; Optimization methods; Problem-solving; Sun; Supply chain management; Supply chains;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    IEEE Industrial Electronics, IECON 2006 - 32nd Annual Conference on
  • Conference_Location
    Paris
  • ISSN
    1553-572X
  • Print_ISBN
    1-4244-0390-1
  • Type

    conf

  • DOI
    10.1109/IECON.2006.347360
  • Filename
    4153237