计算机科学 ›› 2021, Vol. 48 ›› Issue (6A): 206-212. doi: 10.11896/jsjkx.200900196

• 大数据&数据科学 •

一种结合自编码器与强化学习的查询推荐方法

胡潇炜, 陈羽中   

  1. 福州大学数学与计算机科学学院 福州350116
    福建省网络计算与智能信息处理重点实验室 福州350116
  • 出版日期:2021-06-10 发布日期:2021-06-17
  • 通讯作者: 陈羽中(yzchen@fzu.edu.cn)
  • 作者简介:ahtchxw@qq.com
  • 基金资助:
    国家自然科学基金(61672158,61672159,61502104,61502105);福建省高校产学合作项目(2018H6010);福建省自然科学基金(201801795)

Query Suggestion Method Based on Autoencoder and Reinforcement Learning

HU Xiao-wei, CHEN Yu-zhong   

  1. College of Mathematics and Computer Science,Fuzhou University,Fuzhou 350116,China
    Fujian Provincial Key Laboratory of Network Computing and Intelligent Information Processing,Fuzhou 350116,China
  • Online:2021-06-10 Published:2021-06-17
  • About author:HU Xiao-wei,born in 1996,postgraduate.His main research interests include query suggestion and machine reading comprehension.
    CHEN Yu-zhong,Ph.D,professor.His main research interests include natural language processing and data mining.
  • Supported by:
    National Natural Science Foundation of China(61672158,61672159,61502104,61502105),University-Industry Cooperation Project of Fujian Province(2018H6010) and Natural Science Foundation of Fujian Province,China(201801795).

摘要: 查询推荐的目的是发掘搜索引擎用户的查询意图,并给出相关查询推荐。传统的查询推荐方法主要依靠人工提取查询的相关特征,如查询频率、查询时间、用户点击次数和停留时间等,并使用统计学习算法或排序算法给出查询推荐。近年来,深度学习方法在查询推荐问题上获得了广泛应用。现有的用于查询推荐的深度学习方法大多是基于循环神经网络,通过对查询日志中所有查询的语义特征进行建模以预测用户的下一查询。但是,现有的深度学习方法生成的查询推荐上下文感知能力较差,难以准确捕捉用户查询意图,且未充分考虑时间因素对查询推荐的影响,缺乏时效性和多样性。针对上述问题,文中提出了一种结合自编码器与强化学习的查询推荐模型 (Latent Variable Hierarchical Recurrent Encoder-Decoder with Time Information of Query and Reinforcement Learning,VHREDT-RL)。VHREDT-RL引入了强化学习联合训练生成器和判别器,从而增强了生成查询推荐的上下文感知能力;利用融合查询时间信息的隐变量分层递归自编码器作为生成器,使得生成查询推荐有更好的时效性和多样性。AOL数据集上的实验结果表明,文中提出的VHREDT-RL模型获得了优于基准方法的精度、鲁棒性和稳定性。

关键词: 查询推荐, 查询意图, 强化学习, 时间信息, 隐变量分层递归自编码器

Abstract: The purpose of query suggestion is to discover the query intent of search engine users and provide relevant suggested queries. Traditional query suggestion methods mainly rely on manually extracted query features, such as query frequency, query time, user clicks and dwell time, and use statistical learning or ranking algorithms to produce suggestions. In recent years, deep learning methods have been widely applied to query suggestion. Most existing deep learning methods for query suggestion are based on recurrent neural networks, which model the semantic features of all queries in the query log to predict the user's next query. However, the suggestions generated by these methods show poor context awareness and thus fail to accurately capture user query intent; moreover, they do not fully consider the influence of time on query suggestion, so the results lack timeliness and diversity. To address these problems, this paper proposes a query suggestion model that combines an autoencoder with reinforcement learning (Latent Variable Hierarchical Recurrent Encoder-Decoder with Time Information of Query and Reinforcement Learning, VHREDT-RL). VHREDT-RL introduces reinforcement learning to jointly train a generator and a discriminator, thereby enhancing the context awareness of the generated query suggestions, and employs a latent variable hierarchical recurrent encoder-decoder that integrates query time information as the generator, giving the generated suggestions better timeliness and diversity. Experimental results on the AOL dataset show that the proposed VHREDT-RL model achieves better accuracy, robustness and stability than the baseline methods.
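To make the generator described in the abstract concrete, the following is a minimal sketch of a latent-variable hierarchical recurrent encoder-decoder that injects query time information, written in PyTorch. The module names, dimensions, the bucketed hour-of-day time feature and the way the time embedding is concatenated to the session-encoder input are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: a VHRED-style generator with query time information.
# All design choices below (dimensions, time bucketing, concatenation point) are assumptions.
import torch
import torch.nn as nn

class TimeAwareVHREDGenerator(nn.Module):
    def __init__(self, vocab_size=32000, emb_dim=256, hid_dim=512, latent_dim=64, time_buckets=24):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Bucketed query-time embedding (e.g., hour of day) -- an assumed design choice.
        self.time_embed = nn.Embedding(time_buckets, emb_dim)
        self.query_enc = nn.GRU(emb_dim, hid_dim, batch_first=True)              # encodes one query
        self.session_enc = nn.GRU(hid_dim + emb_dim, hid_dim, batch_first=True)  # encodes the query sequence
        # Gaussian latent variable over the session state (VHRED-style).
        self.to_mu = nn.Linear(hid_dim, latent_dim)
        self.to_logvar = nn.Linear(hid_dim, latent_dim)
        self.dec_init = nn.Linear(hid_dim + latent_dim, hid_dim)
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, session_tokens, session_times, next_query_tokens):
        # session_tokens: (batch, n_queries, query_len); session_times: (batch, n_queries)
        b, n, l = session_tokens.shape
        q = self.embed(session_tokens.reshape(b * n, l))
        _, q_state = self.query_enc(q)                       # (1, b*n, hid)
        q_state = q_state.squeeze(0).view(b, n, -1)          # per-query representations
        t = self.time_embed(session_times)                   # (b, n, emb)
        _, s_state = self.session_enc(torch.cat([q_state, t], dim=-1))
        s_state = s_state.squeeze(0)                         # (b, hid) session summary
        mu, logvar = self.to_mu(s_state), self.to_logvar(s_state)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterization trick
        h0 = torch.tanh(self.dec_init(torch.cat([s_state, z], dim=-1))).unsqueeze(0)
        dec_out, _ = self.decoder(self.embed(next_query_tokens), h0)
        return self.out(dec_out), mu, logvar                  # token logits for the suggested query
```

In this sketch the decoder is initialised from both the session state and the sampled latent variable z; a standard VHRED-style training objective would combine the token-level cross-entropy on the next query with a KL regulariser on (mu, logvar).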

Key words: Query intention, Query suggestion, Reinforcement learning, Time information, Latent variable hierarchical recurrent autoencoder
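The abstract further states that reinforcement learning is used to jointly train the generator and a discriminator so that the generated suggestions become more context-aware. A common way to realise such joint training is a SeqGAN-style policy gradient, where the discriminator's probability that a candidate suggestion is a real next query serves as the generator's reward. The sketch below assumes that scheme; the discriminator design, the 0.5 reward baseline and the function names are illustrative assumptions rather than the paper's exact procedure.

```python
# Illustrative sketch only: SeqGAN-style policy-gradient training for query suggestion.
# The discriminator scores a (session context, candidate query) pair; its output is the reward.
import torch
import torch.nn as nn

class SuggestionDiscriminator(nn.Module):
    """Scores (session context vector, candidate query) pairs as real (1) or generated (0)."""
    def __init__(self, vocab_size=32000, emb_dim=256, hid_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.enc = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.score = nn.Linear(2 * hid_dim, 1)

    def forward(self, context_vec, query_tokens):
        _, h = self.enc(self.embed(query_tokens))                       # (1, batch, hid)
        return torch.sigmoid(self.score(torch.cat([context_vec, h.squeeze(0)], dim=-1)))

def reinforce_step(gen_logits, sampled_tokens, rewards, optimizer):
    """One REINFORCE update: raise the log-probability of sampled suggestions in
    proportion to the discriminator reward (a baseline of 0.5 centres the reward)."""
    log_probs = torch.log_softmax(gen_logits, dim=-1)                   # (batch, len, vocab)
    token_logp = log_probs.gather(-1, sampled_tokens.unsqueeze(-1)).squeeze(-1)  # (batch, len)
    advantage = (rewards - 0.5).detach()                                # (batch, 1)
    loss = -(token_logp.sum(dim=1, keepdim=True) * advantage).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In practice the discriminator would be trained alternately on (context, real next query) and (context, sampled suggestion) pairs with a binary cross-entropy loss, while reinforce_step updates only the generator's parameters.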

中图分类号: TP391