Computer Science, 2022, 49(6): 287-296. DOI: 10.11896/jsjkx.210600168
• Artificial Intelligence •
LUO Jun-ren, ZHANG Wan-peng, LU Li-na, CHEN Jing