Computer Science ›› 2023, Vol. 50 ›› Issue (11A): 220800179-5. doi: 10.11896/jsjkx.220800179

• Big Data & Data Science •

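A Deep Learning Ranking Algorithm That Removes Position Bias for Search Auto-completion

ZHOU Mingxing, YAN Xiangzhou, YU Jing, GAO Changju, CHEN Yunwen, JI Daqi, JIN Ke

  1. Datagrand, Shanghai 200120
  • Publication date: 2023-11-09
  • Corresponding author: ZHOU Mingxing (zhoumingxing@datagrand.com)
  • Funding:
    Young Science and Technology Rising-Star Program under the "Science and Technology Innovation Action Plan" of the Science and Technology Commission of Shanghai Municipality (21QB1400100)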

Unbiased Deep Learning to Rank Algorithm for Suggestion Auto-completion

ZHOU Mingxing, YAN Xiangzhou, YU Jing, GAO Changju, CHEN Yunwen, JI Daqi, JIN Ke   

  1. Datagrand Co., Ltd., Shanghai 200120, China
  • Published: 2023-11-09
  • About author: ZHOU Mingxing, born in 1991, master, data scientist. His main research interests include learning to rank, deep learning, image processing, natural language processing, evolutionary algorithms, data mining, and complex networks.
  • Supported by:
    Shanghai Rising-Star Program (21QB1400100).

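Abstract: Search suggestion auto-completion is one of the key means of influencing what users type before they formally submit a search, and it is an indispensable core function of commercial search engines. Providing better suggestions is a ranking problem. In the field of learning to rank, it is widely recognized that collected training data carries position bias and that this bias degrades the ranking quality of the trained model. To address this biased-training-data problem, position bias and relevance are modeled separately with deep learning and combined with improved contextual semantic features to design a new deep learning ranking algorithm that learns position bias and suggestion relevance simultaneously (An Unbiased Deep Learning To Rank Algorithm for Suggestion Auto-completion, UDLTR-SAc), improving the ranking quality of search suggestion auto-completion. UDLTR-SAc automatically learns the bias introduced by position in the training data and thereby learns a more accurate relevance model. It achieves significant gains over both a comparable algorithm that does not account for the bias problem and a classical completion ranking algorithm; in online A/B testing it also achieves a +0.1% (p<0.1) increase in GMV.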

Keywords: Position bias, Deep learning, LTR, Suggestion, Auto-completion, Contextual semantics

Abstract: Suggestion auto-completion is one of the key means of influencing users’ input before a search is submitted, and it is an indispensable core function of commercial search engines. Providing better suggestions is a ranking problem. In the field of learning to rank, it is widely recognized that collected training data carries position bias [1-8], which can degrade the ranking quality of the trained model. To address this problem of biased training data, this paper incorporates improved context-based semantic features and designs an unbiased deep learning to rank algorithm for suggestion auto-completion (UDLTR-SAc) that learns position bias and suggestion relevance simultaneously. According to offline experiments and online A/B tests, UDLTR-SAc automatically learns the bias introduced by position in the training data and thus obtains a more accurate relevance model than both a similar algorithm that does not consider the bias problem and a classical completion ranking algorithm. It also achieves a 0.1% (p<0.1) increase in GMV in the online A/B tests.
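To make the idea of learning position bias and relevance at the same time more concrete, the sketch below uses one common factorization, assumed here purely for illustration and not taken from the paper: the click probability is the product of an examination probability that depends only on the display position and a relevance probability that depends only on the suggestion's features. The class name `TwoTowerClickModel`, the helper `train_on_synthetic_clicks`, the linear relevance scorer (a stand-in for a deep network), and all synthetic probabilities are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TwoTowerClickModel:
    """Illustrative model: P(click) = P(examined | position) * P(relevant | features)."""

    def __init__(self, n_positions, n_features, lr=0.1):
        self.pos_logit = [0.0] * n_positions  # position-bias tower: one logit per slot
        self.w = [0.0] * n_features           # relevance tower: linear stand-in for a deep net
        self.lr = lr

    def examine(self, pos):
        return sigmoid(self.pos_logit[pos])

    def relevance(self, x):
        return sigmoid(sum(wi * xi for wi, xi in zip(self.w, x)))

    def click_prob(self, x, pos):
        return self.examine(pos) * self.relevance(x)

    def sgd_step(self, x, pos, click):
        """One stochastic gradient step on the log-loss of the observed click."""
        e, r = self.examine(pos), self.relevance(x)
        p = min(max(e * r, 1e-9), 1.0 - 1e-9)
        dloss_dp = (p - click) / (p * (1.0 - p))
        # Chain rule through the product e * r and through each sigmoid.
        grad_pos = dloss_dp * r * e * (1.0 - e)
        grad_rel = dloss_dp * e * r * (1.0 - r)
        self.pos_logit[pos] -= self.lr * grad_pos
        for i, xi in enumerate(x):
            self.w[i] -= self.lr * grad_rel * xi


def train_on_synthetic_clicks(seed=0, n_samples=20000):
    """Train on simulated clicks where position 0 is examined far more often."""
    rng = random.Random(seed)
    true_examine = [0.9, 0.2]  # assumed examination probability per position
    model = TwoTowerClickModel(n_positions=2, n_features=2, lr=0.05)
    for _ in range(n_samples):
        pos = rng.randrange(2)
        f = rng.randrange(2)              # binary "good suggestion" feature
        true_rel = 0.8 if f else 0.2      # assumed true relevance
        click = 1 if rng.random() < true_examine[pos] * true_rel else 0
        model.sgd_step([1.0, float(f)], pos, click)  # first feature acts as a bias term
    return model
```

At serving time such a model would rank candidates by `relevance(x)` alone and discard the position tower, which is how the position effect is kept out of the final ordering. The product form only identifies the two factors up to a shared scale, so the learned values are best read as relative rather than absolute.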

Key words: Position bias, Deep learning, Learning to rank(LTR), Suggestion, Auto-completion, Context-based semantic

CLC Number:

  • TP183
[1]AI Q,BI K,LUO C,et al.Unbiased learning to rank with unbiased propensity estimation[C]//The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval.2018:385-394.
[2]WANG X,BENDERSKY M,METZLER D,et al.Learning to rank with selection bias in personal search[C]//Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval.2016:115-124.
[3]WANG X,GOLBANDI N,BENDERSKY M,et al.Position bias estimation for unbiased learning to rank in personal search[C]//Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining.2018:610-618.
[4]HU Z,WANG Y,PENG Q,et al.Unbiased LambdaMART:an unbiased pairwise learning-to-rank algorithm[C]//The World Wide Web Conference.2019:2830-2836.
[5]YUE Y,JOACHIMS T.Interactively optimizing information retrieval systems as a dueling bandits problem[C]//Proceedings of the 26th Annual International Conference on Machine Learning.2009:1201-1208.
[6]SCHUTH A,OOSTERHUIS H,WHITESON S,et al.Multileave gradient descent for fast online learning to rank[C]//Proceedings of the Ninth ACM International Conference on Web Search and Data Mining.2016:457-466.
[7]WANG H,LANGLEY R,KIM S,et al.Efficient exploration of gradient space for online learning to rank[C]//The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval.2018:145-154.
[8]OOSTERHUIS H,DE RIJKE M.Differentiable unbiased online learning to rank[C]//Proceedings of the 27th ACM International Conference on Information and Knowledge Management.2018:1293-1302.
[9]YUAN K,KUANG D.Deep Pairwise Learning To Rank For Search Autocomplete[J].arXiv:2108.04976,2021.
[10]BAR-YOSSEF Z,KRAUS N.Context-sensitive query auto-completion[C]//Proceedings of the 20th International Conference on World Wide Web.2011:107-116.
[11]CHEN T,GUESTRIN C.XGBoost:A scalable tree boosting system[C]//Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.2016:785-794.
[12]LI Z Y,YANG W,XIE Z J.A survey of PageRank research[J].Computer Science,2011,38(B10):185-188.
[13]WANG X P,LI Z Z.An improved ranking algorithm for web page searching[J].Computer Science,2004,31(9A).
[14]LAI X X,HAN L X,ZENG X Q,et al.Research on meta-search ranking algorithms based on information quantity and information entropy[J].Computer Science,2012,39(3):153-156.
[15]WEN K M,LU Z D,SUN X L,et al.A survey of semantic search research[J].Computer Science,2008,35(5):1-4.
[16]ZHANG X,QU Y Z.Ranking problems in the Semantic Web[J].Computer Science,2008,35(2):5.
[17]CARPINETO C,ROMANO G.A survey of automatic query expansion in information retrieval[J].ACM Computing Surveys(CSUR),2012,44(1):1-50.
[18]LIU T Y.Learning to rank for information retrieval[J].Foundations and Trends® in Information Retrieval,2009,3(3):225-331.
[19]CAI F,DE RIJKE M.A survey of query auto completion in information retrieval[J].Foundations and Trends® in Information Retrieval,2016,10(4):273-363.
[20]JOACHIMS T,SWAMINATHAN A,SCHNABEL T.Unbiased learning-to-rank with biased feedback[C]//Proceedings of the Tenth ACM International Conference on Web Search and Data Mining.2017:781-789.
[21]GUO H,YU J,LIU Q,et al.PAL:a position-bias aware learning framework for CTR prediction in live recommender systems[C]//Proceedings of the 13th ACM Conference on Recommender Systems.2019:452-456.