Computer Science, 2010, Vol. 37, Issue (12): 186-189.

Natural Gradient Reinforcement Learning Algorithm with TD(λ)

CHEN Sheng-lei, GU Rui-jun, CHEN Geng, XUE Hui

Online: 2018-12-01   Published: 2018-12-01

Abstract: In recent years, policy gradient methods have attracted extensive interest in reinforcement learning because of their excellent convergence properties. This paper investigates natural gradient algorithms. To address the low efficiency of gradient estimation in existing algorithms, the TD(λ) method is used to approximate the value functions when estimating the gradient. The eligibility traces in TD(λ) propagate learning experience more efficiently. As a result, the variance of the gradient estimate is reduced and convergence is faster. Simulation experiments on a cart-pole balancing system demonstrate the effectiveness of the algorithm.

Key words: Policy gradient, Natural gradient, TD(λ), Eligibility trace
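
To make the role of eligibility traces concrete, the following is a minimal sketch of TD(λ) value estimation with a linear value function and accumulating traces. It is not the authors' algorithm: it covers only the value-function step, not the natural gradient update, and it uses a five-state random walk rather than the cart-pole system. The names `td_lambda` and `random_walk_episode`, the one-hot features, and the step-size settings are all assumptions chosen for illustration.

```python
import random


def td_lambda(episodes, n_features, alpha=0.1, gamma=1.0, lam=0.8):
    """Estimate a linear value function V(s) = w . phi(s) with TD(lambda).

    `episodes` is an iterable of trajectories. Each trajectory is a list of
    (phi, reward, phi_next, done) tuples, where phi and phi_next are feature
    lists of length n_features. All names here are illustrative.
    """
    w = [0.0] * n_features
    for episode in episodes:
        e = [0.0] * n_features  # eligibility trace, reset at episode start
        for phi, reward, phi_next, done in episode:
            v = sum(wi * xi for wi, xi in zip(w, phi))
            v_next = 0.0 if done else sum(wi * xi for wi, xi in zip(w, phi_next))
            delta = reward + gamma * v_next - v  # TD error
            # Accumulating trace: decay old credit, add current features.
            e = [gamma * lam * ei + xi for ei, xi in zip(e, phi)]
            # One TD error updates every recently visited feature at once.
            w = [wi + alpha * delta * ei for wi, ei in zip(w, e)]
    return w


def random_walk_episode(rng, n_states=5):
    """Hypothetical test environment: a symmetric random walk on a chain.

    The walk starts in the middle state. Stepping off the right end gives
    reward 1, stepping off the left end gives reward 0; both end the episode.
    """
    def one_hot(s):
        return [1.0 if i == s else 0.0 for i in range(n_states)]

    s = n_states // 2
    steps = []
    while True:
        s_next = s + rng.choice((-1, 1))
        done = s_next < 0 or s_next >= n_states
        reward = 1.0 if s_next >= n_states else 0.0
        phi_next = [0.0] * n_states if done else one_hot(s_next)
        steps.append((one_hot(s), reward, phi_next, done))
        if done:
            return steps
        s = s_next


if __name__ == "__main__":
    rng = random.Random(0)
    episodes = [random_walk_episode(rng) for _ in range(5000)]
    weights = td_lambda(episodes, n_features=5, alpha=0.01)
    print(weights)
```

With one-hot features each weight is the value estimate of one state, and for this chain the true values are 1/6, 2/6, ..., 5/6. Setting `lam=0` reduces the update to one-step TD, where a reward at the end of an episode reaches earlier states only over many episodes; larger `lam` lets the trace pass that reward back along the whole visited path in a single update, which is the efficiency gain the abstract refers to.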
