Computer Science ›› 2022, Vol. 49 ›› Issue (6): 149-157. doi: 10.11896/jsjkx.210600226

• Database & Big Data & Data Science •

Study on Intelligent Recommendation Method of Dueling Network Reinforcement Learning Based on Regret Exploration

HONG Zhi-li, LAI Jun, CAO Lei, CHEN Xi-liang, XU Zhi-xiong   

  1. Command & Control Engineering College, Army Engineering University of PLA, Nanjing 210007, China
  • Received: 2021-06-29 Revised: 2021-10-16 Online: 2022-06-15 Published: 2022-06-08
  • About author: HONG Zhi-li, born in 1994, postgraduate. His main research interests include deep reinforcement learning, recommendation system and game confrontation.
    LAI Jun, born in 1979, postgraduate, associate professor, master supervisor. His main research interests include deep reinforcement learning and command information system engineering.

Abstract: In recent years, the application of deep reinforcement learning to recommendation systems has attracted much attention. Building on existing research, this paper proposes a new recommendation model, RP-Dueling, which is based on the deep reinforcement learning algorithm Dueling-DQN and adds a regret exploration mechanism so that the algorithm adaptively and dynamically adjusts the “exploration-exploitation” ratio according to the degree of training. The algorithm can capture users’ dynamic interests and fully explore the action space in recommendation systems with large-scale state spaces. Tested on multiple data sets, the proposed model achieves best average MAE and RMSE values of 0.16 and 0.43, respectively, improvements of 0.48 and 0.56 over the best previously reported results. Experimental results show that the proposed model outperforms both existing traditional recommendation models and recommendation models based on deep reinforcement learning.
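
To make two ingredients named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It shows the mean-subtracted aggregation commonly used in Dueling-DQN, Q(s, a) = V(s) + A(s, a) - mean(A(s, ·)), and the standard definitions of the MAE and RMSE metrics. The function names and the toy numbers are assumptions chosen for illustration; the regret exploration mechanism is not sketched because the abstract does not specify it.

```python
import math


def dueling_q_values(state_value, advantages):
    """Combine a state value V(s) and per-action advantages A(s, a) into Q(s, a).

    Uses the mean-subtracted form Q = V + A - mean(A), the usual Dueling-DQN
    aggregation. Illustrative helper; the name is an assumption.
    """
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


def mae(predicted, actual):
    """Mean absolute error between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root mean squared error between predicted and actual ratings."""
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


# Toy values, purely hypothetical (not taken from the paper).
q_values = dueling_q_values(1.0, [0.0, 1.0, 2.0])
example_mae = mae([3.0, 4.0], [4.0, 2.0])
example_rmse = rmse([3.0, 4.0], [4.0, 2.0])
```

For both MAE and RMSE a lower value means a better rating prediction, which is why the reported 0.16 and 0.43 are read as improvements over earlier results.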

Key words: Deep reinforcement learning, Dueling-DQN, Dynamic interest, Recommendation system, Regret exploration, RP-Dueling

CLC Number: 

  • TP181