• DocumentCode
    260235
  • Title

    Improving the performance of Q-learning using simultaneous Q-values updating

  • Author

    Pouyan, Maryam ; Mousavi, Amin ; Golzari, Shahram ; Hatam, Ahmad

  • Author_Institution
    Electr. & Comput. Eng. Dept., Hormozgan Univ., Bandarabbas, Iran
  • fYear
    2014
  • fDate
    26-27 Nov. 2014
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    Q-learning is one of the best model-free reinforcement learning algorithms. Its goal is to find an estimate of the optimal action-value function, called the Q-value function. The Q-value function is defined as the expected sum of future rewards obtained by taking an action in the current state. The main drawback of Q-learning is that the learning process is expensive for the agent, especially in the early steps, because every state-action pair must be visited frequently in order to converge to the optimal policy. In this paper, the concept of opposite action is used to improve the performance of the Q-learning algorithm, especially in the early steps of learning. Opposite actions suggest updating two Q-values simultaneously: the agent updates the Q-value of each action and of the corresponding opposite action, thereby increasing the speed of learning. The novel Q-learning method based on the concept of opposite action is simulated on the well-known grid world test-bed problem. The results show the ability of the proposed method to improve the learning process.
  • Keywords
    learning (artificial intelligence); optimisation; Q-learning; Q-value function; optimal action-value function; reinforcement learning algorithm; Computational intelligence; Computers; Convergence; Educational institutions; Knowledge engineering; Learning (artificial intelligence); Standards; Q-learning; estimate value; opposite action; reinforcement learning;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Technology, Communication and Knowledge (ICTCK), 2014 International Congress on
  • Conference_Location
    Mashhad
  • Type

    conf

  • DOI
    10.1109/ICTCK.2014.7033528
  • Filename
    7033528
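
The idea summarized in the abstract, updating the Q-value of the chosen action and of its opposite action in the same step, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the 5x5 grid, the reward of 1 at the goal, the constants ALPHA, GAMMA and EPSILON, the OPPOSITE table, and the helper names step, update and train are all assumptions made for the example. In particular, the abstract does not say how the outcome of the opposite action is obtained; here it is assumed that the agent can simulate it with the same deterministic step function.

```python
import random

# Hypothetical 4-action grid world; all names and constants are illustrative assumptions.
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
OPPOSITE = {"up": "down", "down": "up", "left": "right", "right": "left"}
SIZE, GOAL = 5, (4, 4)
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1


def step(state, action):
    """Deterministic move; walls keep the agent in place. Reward 1 at the goal, else 0."""
    dr, dc = ACTIONS[action]
    row = min(max(state[0] + dr, 0), SIZE - 1)
    col = min(max(state[1] + dc, 0), SIZE - 1)
    nxt = (row, col)
    return nxt, (1.0 if nxt == GOAL else 0.0)


def update(Q, state, action, reward, nxt):
    """Standard one-step Q-learning update; the goal is treated as terminal."""
    best_next = 0.0 if nxt == GOAL else max(Q[(nxt, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])


def train(episodes=200, use_opposite=True, seed=0):
    rng = random.Random(seed)
    Q = {((r, c), a): 0.0 for r in range(SIZE) for c in range(SIZE) for a in ACTIONS}
    for _ in range(episodes):
        state = (0, 0)
        for _ in range(200):  # cap on steps per episode
            if rng.random() < EPSILON:
                action = rng.choice(list(ACTIONS))
            else:
                # Greedy choice with random tie-breaking.
                best = max(Q[(state, a)] for a in ACTIONS)
                action = rng.choice([a for a in ACTIONS if Q[(state, a)] == best])
            nxt, reward = step(state, action)
            update(Q, state, action, reward, nxt)
            if use_opposite:
                # Assumption: the opposite action's outcome is simulated, not executed.
                opp = OPPOSITE[action]
                opp_next, opp_reward = step(state, opp)
                update(Q, state, opp, opp_reward, opp_next)
            state = nxt
            if state == GOAL:
                break
    return Q
```

Calling train(use_opposite=False) gives plain Q-learning under the same assumptions, so the two settings can be compared on the same grid and seed.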