Computer Science ›› 2021, Vol. 48 ›› Issue (8): 234-239. doi: 10.11896/jsjkx.200700162

Special topic: Natural Language Processing (virtual special issue)

• Artificial Intelligence •


Compound Conversation Model Combining Retrieval and Generation

YANG Hui-min, MA Ting-huai

  1. College of Computer and Software, Nanjing University of Information Science & Technology, Nanjing 210044, China
  • Received: 2020-07-26 Revised: 2020-09-17 Published: 2021-08-10
  • Corresponding author: MA Ting-huai (thma@nuist.edu.cn)
  • About author: YANG Hui-min, born in 1997, postgraduate. Her main research interests include data mining and data sharing. (2432640905@qq.com) MA Ting-huai, born in 1974, Ph.D., professor, is a member of China Computer Federation. His main research interests include data mining, data sharing and privacy protection.
  • Supported by:
    National Natural Science Foundation of China (U1736105).

Abstract: Conversation modeling is an important direction in natural language processing. Current conversation models fall into two main categories: retrieval-based and generation-based. However, retrieval-based methods cannot respond to questions that do not appear in the corpus, while generation-based methods tend to produce generic "safe" responses. To address this, a compound conversation model combining retrieval and generation is proposed, so that each method compensates for the other's shortcomings. First, the retrieval module returns K retrieval contexts and their K corresponding retrieval candidate responses. The multi-response generation module then uses the retrieval contexts to produce several generated candidate responses. Finally, the candidate response ranking module works in two steps: pre-screening and post-reranking. Pre-screening computes the similarity between the input question and the candidate responses to select the best retrieved response and the best generated response; post-reranking then selects the answer most suitable for the input question. Experimental results show that, compared with traditional models, the compound conversation model improves the BLEU score by 6% and the diversity metric by 12%.

Key words: Conversation system, Generation model, Post-reranking, Retrieval model, Transformer


  • TP319.1
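
The three-stage pipeline described in the abstract (retrieval, multi-response generation, then pre-screening and post-reranking) can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not specify the similarity measure, the generator, or the reranker, so Jaccard token overlap stands in for the similarity, `toy_generate` is a hard-coded stub in place of a Transformer generator, and `toy_rerank` is a hypothetical scoring function. All function and variable names are invented for this example.

```python
from typing import Callable, List, Set, Tuple


def tokenize(text: str) -> Set[str]:
    """Lowercased whitespace tokens (assumed stand-in for real tokenization)."""
    return set(text.lower().split())


def jaccard(a: str, b: str) -> float:
    """Jaccard token overlap; an assumed stand-in for the paper's similarity."""
    ta, tb = tokenize(a), tokenize(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def retrieve_top_k(query: str, corpus: List[Tuple[str, str]], k: int) -> List[Tuple[str, str]]:
    """Retrieval module: the K (context, response) pairs whose context best matches the query."""
    ranked = sorted(corpus, key=lambda pair: jaccard(query, pair[0]), reverse=True)
    return ranked[:k]


def best_by_similarity(query: str, candidates: List[str]) -> str:
    """Pre-screening: keep the candidate most similar to the input question."""
    return max(candidates, key=lambda c: jaccard(query, c))


def respond(
    query: str,
    corpus: List[Tuple[str, str]],
    generate: Callable[[str, List[str]], List[str]],
    rerank_score: Callable[[str, str], float],
    k: int = 2,
) -> str:
    retrieved = retrieve_top_k(query, corpus, k)
    contexts = [context for context, _ in retrieved]
    retrieved_responses = [response for _, response in retrieved]
    # Multi-response generation conditioned on the retrieved contexts.
    generated_responses = generate(query, contexts)
    # Pre-screening: one best retrieved and one best generated response.
    finalists = [
        best_by_similarity(query, retrieved_responses),
        best_by_similarity(query, generated_responses),
    ]
    # Post-reranking: choose between the two finalists.
    return max(finalists, key=lambda r: rerank_score(query, r))


def toy_generate(query: str, contexts: List[str]) -> List[str]:
    """Hypothetical stub; a real generator would condition on query and contexts."""
    return ["i do not know", "i like jazz music too"]


def toy_rerank(query: str, response: str) -> float:
    """Hypothetical reranker: fraction of response tokens that also occur in the query."""
    tokens = tokenize(response)
    return len(tokens & tokenize(query)) / len(tokens) if tokens else 0.0


corpus = [
    ("what is your favorite food", "i love noodles"),
    ("do you like music", "yes i like music a lot"),
    ("where do you live", "i live in nanjing"),
]
query = "do you like jazz music"
print(respond(query, corpus, toy_generate, toy_rerank, k=2))
```

Passing the generator and reranker in as callables keeps the sketch independent of any particular model, so either stub could be swapped for a trained component without changing the control flow.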
