Computer Science ›› 2010, Vol. 37 ›› Issue (12): 186-189.
CHEN Sheng-lei, GU Rui-jun, CHEN Geng, XUE Hui
Abstract: In recent years, policy gradient methods have attracted extensive interest in reinforcement learning because of their excellent convergence properties. This paper investigates natural gradient algorithms. To address the low efficiency of gradient estimation in existing algorithms, the TD(λ) method is used to approximate the value functions when estimating the gradient. The eligibility traces in TD(λ) propagate learning experience more efficiently. As a result, the variance of the gradient estimate is reduced and convergence is faster. A simulation experiment on a cart-pole balancing system demonstrates the effectiveness of the algorithm.
Key words: Policy gradient, Natural gradient, TD(λ), Eligibility trace
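The value-estimation step named in the abstract can be illustrated with a minimal sketch of tabular TD(λ) with accumulating eligibility traces. This is not the authors' algorithm: the natural gradient update is not shown, and the function name `td_lambda`, the episode format, and the default step size, discount, and λ values are all assumptions made for illustration.

```python
def td_lambda(episodes, n_states, alpha=0.1, gamma=1.0, lam=0.8):
    """Tabular TD(lambda) value estimation (illustrative sketch).

    episodes: list of trajectories; each trajectory is a list of
              (state, reward, next_state) tuples, with next_state set
              to None on the terminal transition (assumed format).
    Returns the list of estimated state values.
    """
    v = [0.0] * n_states
    for episode in episodes:
        e = [0.0] * n_states  # eligibility traces, reset per episode
        for s, r, s_next in episode:
            v_next = 0.0 if s_next is None else v[s_next]
            delta = r + gamma * v_next - v[s]  # TD error
            e[s] += 1.0                        # accumulating trace
            for i in range(n_states):
                v[i] += alpha * delta * e[i]   # credit all recently visited states
                e[i] *= gamma * lam            # decay every trace
    return v


# Hypothetical two-step episode: state 0 -> state 1 -> terminal, reward 1 at the end.
episode = [[(0, 0.0, 1), (1, 1.0, None)]]
with_traces = td_lambda(episode, 2, alpha=0.5, gamma=1.0, lam=1.0)
without_traces = td_lambda(episode, 2, alpha=0.5, gamma=1.0, lam=0.0)
```

In this toy episode, λ = 1 lets the final reward update the value of state 0 within the same episode, while λ = 0 updates only state 1. That is the sense in which eligibility traces propagate experience more efficiently.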
CHEN Sheng-lei, GU Rui-jun, CHEN Geng, XUE Hui. Natural Gradient Reinforcement Learning Algorithm with TD(λ)[J]. Computer Science, 2010, 37(12): 186-189.
URL: https://www.jsjkx.com/EN/
https://www.jsjkx.com/EN/Y2010/V37/I12/186