Computer Science ›› 2022, Vol. 49 ›› Issue (3): 232-238. doi: 10.11896/jsjkx.210200153

• Artificial Intelligence •

Conversational Comprehension Model for Question Generation

SHI Yu-tao, SUN Xiao   

  1. School of Computer and Information, Hefei University of Technology, Hefei 230601, China
     Key Laboratory of Affective Computing and Advanced Intelligent Machines of Anhui Province, Hefei University of Technology, Hefei 230601, China
  • Received: 2021-02-24 Revised: 2021-06-13 Online: 2022-03-15 Published: 2022-03-15
  • Corresponding author: SUN Xiao (sunx@hfut.edu.cn)
  • About author: SHI Yu-tao (2019110984@mail.hfut.edu.cn), born in 1997, postgraduate. His main research interests include natural language processing and machine learning.
    SUN Xiao, born in 1980, Ph.D., professor, is a member of China Computer Federation. His main research interests include affective computing, natural language processing, machine learning and human-machine interaction.
  • Supported by:
    National Natural Science Foundation of China (61976078).


Abstract: Conversational question generation (CQG) differs from the standard question generation task, which produces single-turn questions from a paragraph and an answer: CQG additionally considers the conversational context formed by the historical question-answer pairs, so that each generated question follows on from the conversation history and remains highly consistent with it. To exploit this property, this paper proposes word-level and sentence-level attention modules that strengthen the extraction of conversation history information, ensuring that the question for the current turn incorporates the features of every word and sentence in the history and is therefore coherent and of high quality. The correctness of the question word is also important: the generated question must match the answer type of the original question in the dataset, so an additional loss function is constructed in the question word prediction module to constrain the question word type. Combining these modules yields the conversational comprehension network (CCNet). Experiments show that the model surpasses the baseline models on most evaluation metrics; on the CoQA dataset, Bleu1 and Bleu2 reach 39.70 and 23.76, respectively, and the generated questions are of higher quality. Ablation and cross-dataset experiments further confirm the effectiveness of the model, indicating that CCNet generalizes well.
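
The abstract describes word-level and sentence-level attention modules over the conversation history but does not spell out their form. The following PyTorch sketch shows one plausible reading, a hierarchical attention in which word-level attention pools the tokens of each history turn and sentence-level attention pools the resulting turn vectors against the current decoder state; the additive scoring function, module names and dimensions are illustrative assumptions, not the authors' implementation.

# Minimal sketch of word-level + sentence-level attention over the
# conversation history, conditioned on the current decoder state.
# Dimensions and the additive (Bahdanau-style) scoring are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AdditiveAttention(nn.Module):
    def __init__(self, query_dim, key_dim, hidden_dim=128):
        super().__init__()
        self.proj = nn.Linear(query_dim + key_dim, hidden_dim)
        self.score = nn.Linear(hidden_dim, 1)

    def forward(self, query, keys):
        # query: (batch, query_dim); keys: (batch, n, key_dim)
        q = query.unsqueeze(1).expand(-1, keys.size(1), -1)
        scores = self.score(torch.tanh(self.proj(torch.cat([q, keys], dim=-1))))
        weights = F.softmax(scores.squeeze(-1), dim=-1)            # (batch, n)
        context = torch.bmm(weights.unsqueeze(1), keys).squeeze(1)  # (batch, key_dim)
        return context, weights


class HistoryEncoder(nn.Module):
    """Encodes each history turn with word-level attention, then pools
    the turn vectors with sentence-level attention."""

    def __init__(self, vocab_size, emb_dim=300, hid_dim=256, dec_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.word_rnn = nn.GRU(emb_dim, hid_dim, batch_first=True,
                               bidirectional=True)
        self.word_attn = AdditiveAttention(dec_dim, 2 * hid_dim)
        self.sent_attn = AdditiveAttention(dec_dim, 2 * hid_dim)

    def forward(self, history, dec_state):
        # history: (batch, n_turns, n_tokens) token ids
        # dec_state: (batch, dec_dim) current decoder hidden state
        b, t, n = history.size()
        words = self.embed(history.view(b * t, n))
        word_states, _ = self.word_rnn(words)                      # (b*t, n, 2h)
        q = dec_state.unsqueeze(1).expand(-1, t, -1).reshape(b * t, -1)
        turn_vecs, _ = self.word_attn(q, word_states)               # (b*t, 2h)
        turn_vecs = turn_vecs.view(b, t, -1)
        history_ctx, turn_weights = self.sent_attn(dec_state, turn_vecs)
        return history_ctx, turn_weights                            # (b, 2h), (b, t)


if __name__ == "__main__":
    enc = HistoryEncoder(vocab_size=5000)
    hist = torch.randint(1, 5000, (2, 3, 12))   # 2 dialogues, 3 turns, 12 tokens
    state = torch.randn(2, 256)
    ctx, w = enc(hist, state)
    print(ctx.shape, w.shape)                   # torch.Size([2, 512]) torch.Size([2, 3])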

Key words: Attention mechanism, Conversational question generation, Gated network, Question generation, Recurrent neural network
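
The abstract also mentions an additional loss in the question word prediction module that constrains the question word type to match the answer type. Below is a minimal sketch of one way such a joint objective can be written; the question-word inventory, the linear classifier head and the weighting factor alpha are assumptions for illustration, not the paper's exact formulation.

# Illustrative joint objective: sequence cross-entropy for the generated
# question plus an auxiliary cross-entropy that pushes the predicted
# interrogative word toward the type implied by the answer.
import torch
import torch.nn as nn
import torch.nn.functional as F

QUESTION_WORDS = ["what", "who", "when", "where", "why", "how", "which", "whose"]


class QuestionWordPredictor(nn.Module):
    def __init__(self, ctx_dim=512, n_types=len(QUESTION_WORDS)):
        super().__init__()
        self.classifier = nn.Linear(ctx_dim, n_types)

    def forward(self, context):
        # context: (batch, ctx_dim) fused paragraph/answer/history representation
        return self.classifier(context)          # (batch, n_types) logits


def joint_loss(gen_logits, gen_targets, qw_logits, qw_targets,
               alpha=0.5, pad_id=0):
    """Generation loss + alpha * question-word classification loss."""
    gen_loss = F.cross_entropy(gen_logits.view(-1, gen_logits.size(-1)),
                               gen_targets.view(-1), ignore_index=pad_id)
    qw_loss = F.cross_entropy(qw_logits, qw_targets)
    return gen_loss + alpha * qw_loss


if __name__ == "__main__":
    batch, seq_len, vocab = 2, 10, 5000
    gen_logits = torch.randn(batch, seq_len, vocab)
    gen_targets = torch.randint(1, vocab, (batch, seq_len))
    predictor = QuestionWordPredictor()
    qw_logits = predictor(torch.randn(batch, 512))
    qw_targets = torch.tensor([QUESTION_WORDS.index("who"),
                               QUESTION_WORDS.index("when")])
    print(joint_loss(gen_logits, gen_targets, qw_logits, qw_targets).item())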

CLC Number:

  • TP391
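
The Bleu1 and Bleu2 figures quoted in the abstract are n-gram precision scores. The short sketch below shows how BLEU-1/BLEU-2 can be computed with NLTK; the tokenization and smoothing choices here are assumptions and will not necessarily reproduce the paper's evaluation pipeline.

# Corpus-level BLEU-1/BLEU-2 with NLTK; one reference list per hypothesis.
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

references = [[["who", "wrote", "the", "book", "?"]]]
hypotheses = [["who", "is", "the", "author", "?"]]

smooth = SmoothingFunction().method1
bleu1 = corpus_bleu(references, hypotheses, weights=(1.0,),
                    smoothing_function=smooth)
bleu2 = corpus_bleu(references, hypotheses, weights=(0.5, 0.5),
                    smoothing_function=smooth)
print(f"Bleu1 = {bleu1:.4f}, Bleu2 = {bleu2:.4f}")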