Computer Science, 2021, Vol. 48, Issue (12): 278-285. doi: 10.11896/jsjkx.210900250

• Artificial Intelligence •

Survey on Retrieval-based Chatbots

WU Yu¹, LI Zhou-jun²

  • 1. Natural Language Computing Group, Microsoft Research Asia, Beijing 100080, China
    2. School of Computer Science and Engineering, Beihang University, Beijing 100191, China
  • Received: 2020-03-20 Revised: 2020-12-20 Online: 2021-12-15 Published: 2021-11-26
  • About author: WU Yu, born in 1992, Ph.D., senior researcher. His main research interests include natural language processing and spoken language processing.
    LI Zhou-jun, born in 1963, Ph.D., professor, is a member of China Computer Federation. His main research interests include data mining, natural language processing, and network and information security.
  • Supported by:
    National Natural Science Foundation of China (U1636211, 61672081) and Fund of the State Key Laboratory of Software Development Environment (SKLSDE-2021ZX-18).

Abstract: With the rapid progress of natural language processing techniques and the massive amount of conversational data accessible on the Internet, non-task-oriented dialogue systems, also referred to as chatbots, have achieved great success and drawn attention from both academia and industry. Currently, there are two lines of chatbot research: retrieval-based chatbots and generation-based chatbots. Because they return fluent responses with low latency, retrieval-based chatbots are commonly used in practice. This paper first briefly introduces the research background, basic structure, and component modules of retrieval-based chatbots, and then describes in detail the constraints of the response selection module and the related datasets. Subsequently, we summarize recent popular techniques for the response selection problem, including statistical methods, representation-based neural network methods, interaction-based neural network methods, and pre-training-based methods. Finally, we present the challenges facing chatbots and outline promising directions for future work.

Key words: Chatbot, Natural language processing, Pre-training technology, Response selection, Text matching

CLC Number: 

  • TP391