Computer Science (计算机科学), 2021, Vol. 48, Issue (8): 234-239. doi: 10.11896/jsjkx.200700162

Special topic: Natural Language Processing (virtual special topic)

• Artificial Intelligence •

Compound Conversation Model Combining Retrieval and Generation (融合检索与生成的复合对话模型)

YANG Hui-min (杨慧敏), MA Ting-huai (马廷淮)

  1. College of Computer and Software, Nanjing University of Information Science & Technology, Nanjing 210044, China
  • Received: 2020-07-26 Revised: 2020-09-17 Published: 2021-08-10
  • Corresponding author: MA Ting-huai (thma@nuist.edu.cn)
  • About the authors: YANG Hui-min, born in 1997, postgraduate. Her main research interests include data mining and data sharing (2432640905@qq.com). MA Ting-huai, born in 1974, Ph.D, professor, is a member of China Computer Federation. His main research interests include data mining, data sharing and privacy protection.
  • Supported by:
    National Natural Science Foundation of China (U1736105).

Abstract: Conversation models are an important research direction in natural language processing. Current dialogue models fall mainly into retrieval-based and generation-based methods. However, retrieval-based methods cannot respond to questions that do not appear in the corpus, while generation-based methods tend to produce safe (generic) responses. To address this, a compound conversation model that combines retrieval and generation is proposed, so that each method compensates for the other's shortcomings. First, the retrieval module obtains K retrieval contexts and the corresponding K retrieval candidate responses. The multi-response generation module then incorporates the retrieval contexts to produce several generated candidate responses. Finally, the candidate response ranking module works in two steps: pre-screening and post-reranking. The pre-screening step computes the similarity between the input question and the candidate responses to obtain the optimal retrieval response and the optimal generated response, and the post-reranking step then selects the answer most suitable for the input question. Experimental results show that, compared with traditional models, the compound conversation model improves the BLEU metric by 6% and the diversity metric by 12%.
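The three-stage flow described in the abstract (retrieve K candidates, add generated candidates, then pre-screen and post-rerank) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the bag-of-words `cosine` function stands in for whatever similarity model the paper uses, the generated candidates are passed in as plain strings rather than produced by a generation model, and the post-reranking step simply reuses the same similarity score. The function names and the toy corpus are hypothetical.

```python
import math
from collections import Counter

def cosine(a: str, b: str) -> float:
    """Bag-of-words cosine similarity (a stand-in for a learned similarity model)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

def retrieve_top_k(query, corpus, k):
    """Return the k (context, response) pairs whose context best matches the query."""
    return sorted(corpus, key=lambda pair: cosine(query, pair[0]), reverse=True)[:k]

def pre_screen(query, candidates):
    """Pre-screening: keep the single candidate most similar to the query."""
    return max(candidates, key=lambda r: cosine(query, r))

def rank_responses(query, corpus, generated, k=2):
    """Pick one final answer from retrieved and generated candidates."""
    retrieved = retrieve_top_k(query, corpus, k)
    best_retrieved = pre_screen(query, [resp for _, resp in retrieved])
    best_generated = pre_screen(query, generated)
    # Post-reranking placeholder: choose between the two finalists
    # with the same similarity score (an assumption, not the paper's method).
    return max([best_retrieved, best_generated], key=lambda r: cosine(query, r))

# Hypothetical toy data.
corpus = [
    ("what is the weather today", "the weather today is sunny"),
    ("do you like music", "yes i like jazz music"),
    ("where are you from", "i am from nanjing"),
]
generated = ["i do not know", "the weather is nice"]
query = "how is the weather today"
print(rank_responses(query, corpus, generated))
```

In this toy run both finalists come from different sources (one retrieved, one generated), which is the point of the two-step ranking: each source contributes its best candidate before a final choice is made.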

Key words: Conversation system, Generation model, Post-reranking, Retrieval model, Transformer
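The abstract reports a gain on a diversity metric without naming it on this page. A common choice for dialogue models is Distinct-n, the ratio of unique n-grams to total n-grams over all generated responses; the sketch below assumes that definition and whitespace tokenization, and the function name and sample responses are hypothetical. It shows why repeated safe responses score low.

```python
from typing import List

def distinct_n(responses: List[str], n: int) -> float:
    """Unique n-grams divided by total n-grams over all responses."""
    ngrams = []
    for text in responses:
        tokens = text.split()  # whitespace tokenization (an assumption)
        ngrams.extend(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0

# Hypothetical outputs: a model stuck on a safe response vs. a varied one.
safe = ["i do not know", "i do not know", "i do not know"]
varied = ["the weather is sunny", "i like jazz music", "i am from nanjing"]

for name, outputs in (("safe", safe), ("varied", varied)):
    print(name, distinct_n(outputs, 1), distinct_n(outputs, 2))
```

A model that always returns the same safe response yields few unique n-grams relative to the total, so its score stays low, while varied responses push the ratio toward 1.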

CLC Number:

  • TP319.1