Computer Science ›› 2026, Vol. 53 ›› Issue (4): 366-376. DOI: 10.11896/jsjkx.250700198
• Artificial Intelligence •
PAN Jiahao, FENG Xiang, YU Huiqun
| [1] | CHEN Hongxiu, ZENG Xia, LIU Zhiming, ZHAO Hengjun. Automatic Theorem Proving Based on Pre-trained Language Models and Unification [J]. Computer Science, 2026, 53(4): 40-47. |
| [2] | WU Yansheng, CAO Xinyi, FAN Weibei. Research on Efficient Construction of Plateaued Functions Based on DQN-enhanced Genetic Algorithm [J]. Computer Science, 2026, 53(4): 57-65. |
| [3] | HUANG Beibei, LIU Jinfeng. Causal Disentangled Representation Learning with Integrated Sparse Coding [J]. Computer Science, 2026, 53(4): 66-77. |
| [4] | ZHANG Xueqin, WANG Zhineng, LI Jinsheng, LU Yisong, LUO Fei. Key Node Identification in Temporal Social Networks Based on Deep Learning and Multi-feature Fusion [J]. Computer Science, 2026, 53(4): 143-154. |
| [5] | QIN Haiqi, MI Jusheng. Concept-cognitive Learning and Incremental Learning in Complex Networks [J]. Computer Science, 2026, 53(4): 208-214. |
| [6] | LIU Jiaqi, WANG Yujie, XIANG Guodu, YU Kui, CAO Fuyuan. Long-term Causal Effect Estimation Based on Deep Reinforcement Learning [J]. Computer Science, 2026, 53(4): 235-244. |
| [7] | HUA Yu, ZHOU Xiaocheng, SHEN Xiangjun, LIU Zhifeng, ZHOU Conghua. Phase-preserved MinMax Framework for Graph Augmentation in Frequency Domain [J]. Computer Science, 2026, 53(4): 245-251. |
| [8] | GE Zeqing, HUANG Shengjun. Semi-supervised Learning Method for Multi-label Tabular Data [J]. Computer Science, 2026, 53(3): 151-157. |
| [9] | WANG Yiming, JIAO Min, ZHAO Suyun, CHEN Hong, LI Cuiping. Prompt-conditioned Representation Learning with Diffusion Models for Semi-supervised Clustering [J]. Computer Science, 2026, 53(3): 158-165. |
| [10] | ZHAO Binbei, ZHU Li, ZHAO Hongli, LI Yutong. Computer Vision Applications in Rail Transit Systems [J]. Computer Science, 2026, 53(3): 214-224. |
| [11] | JIA Shuheng, FU Huimin. Optimizing Probabilistic Choice for Solving SAT Problems [J]. Computer Science, 2026, 53(3): 366-374. |
| [12] | HUANG Miaomiao, WANG Huiying, WANG Meixia, WANG Yejiang, ZHAO Yuhai. Review of Graph Embedding Learning Research: From Simple Graph to Complex Graph [J]. Computer Science, 2026, 53(1): 58-76. |
| [13] | WANG Haoyan, LI Chongshou, LI Tianrui. Reinforcement Learning Method for Solving Flexible Job Shop Scheduling Problem Based on Double Layer Attention Network [J]. Computer Science, 2026, 53(1): 231-240. |
| [14] | DUAN Pengting, WEN Chao, WANG Baoping, WANG Zhenni. Collaborative Semantics Fusion for Multi-agent Behavior Decision-making [J]. Computer Science, 2026, 53(1): 252-261. |
| [15] | ZENG Dan, HE Xingxing, LI Yingfang, LI Tianrui. Structures of Multi-line Standard Contradictions in First-order Logic [J]. Computer Science, 2025, 52(12): 200-208. |