Computer Science, 2014, Vol. 41, Issue (6): 239-242. doi: 10.11896/j.issn.1002-137X.2014.06.047
JIN Yu-jing, ZHU Wen-wen, FU Yu-chen and LIU Quan
[1] Sutton R S, Barto A G. Reinforcement Learning: An Introduction[M]. MIT Press, 1998
[2] Busoniu L, Babuska R, De Schutter B, et al. Reinforcement Learning and Dynamic Programming Using Function Approximators[M]. Boca Raton, FL: CRC Press, 2010
[3] Grondman I, Busoniu L, et al. A Survey of Actor-Critic Reinforcement Learning: Standard and Natural Policy Gradients[J]. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 2012, 42(6): 1291-1307
[4] Barto A G, Sutton R S, Anderson C W. Neuronlike Adaptive Elements That Can Solve Difficult Learning Control Problems[J]. IEEE Trans Syst Man Cybern, 1983, 13: 834-846
[5] Konda V R, Tsitsiklis J N. Actor-Critic Algorithms[C]∥Proceedings of Advances in Neural Information Processing Systems. 2000
[6] Rosenstein M T, Barto A G. Supervised Learning Combined with an Actor-Critic Architecture[J]. CMPSCI Technical Report 02-41. October 2002
[7] Peters J, Schaal S. Natural actor-critic[J]. Neurocomputing, 2008, 71(7-9): 1180-1190
[8] Bhatnagar S, Sutton R S, Ghavamzadeh M, et al. Natural actor-critic algorithms[J]. Automatica, 2009, 45(11): 2471-2482
[9] Vamvoudakis K G, Lewis F L. Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem[J]. Automatica, 2010, 46(5): 878-888
[10] Grondman I, Vaandrager M, Busoniu L, et al. Efficient Model Learning Methods for Actor-Critic Control[J]. IEEE Transactions on Systems Man and Cybernetics Part B-Cybernetics, 2012, 42(3): 591-602
[11] Grondman I, Vaandrager M, Busoniu L, et al. Actor-Critic Control with Reference Model Learning[C]∥Proceedings of the 18th IFAC World Congress. Milan, Italy, 2011: 14723-14728
[12] Kuvayev L, Sutton R. Model-Based Reinforcement Learning with an Approximate, Learned Model[C]∥Proceedings of the Ninth Yale Workshop on Adaptive and Learning Systems. 1996: 101-105
[13] Walsh T J, Goschin S, Littman M L. Integrating sample-based planning and model-based reinforcement learning[C]∥Proc. Assoc. Adv. Artif. Intell. Atlanta, GA, 2010: 612-617
[14] Santamaria J, Sutton R, Ram A. Experiments with reinforcement learning in problems with continuous state and action spaces[J]. Adaptive Behavior, 1998, 6: 163-217
[15] Sherstov A A, Stone P. Function Approximation via Tile Coding: Automating parameter choice[C]∥Zucker J-D, Saitta L, eds. SARA, volume 3607 of Lecture Notes in Computer Science. Springer, 2005: 194-205
[16] Lanzi P L, Loiacono D, Wilson S W, et al. Classifier Prediction based on Tile Coding[C]∥Proceedings of the 2006 Genetic and Evolutionary Computation Conference Workshop Program (GECCO 2006). Seattle, Washington, 2006: 1497-1504