Computer Science, 2021, Vol. 48, Issue (6): 159-167. doi: 10.11896/jsjkx.201100013

Special Topic: Natural Language Processing (Virtual Special Issue)

• Artificial Intelligence •


Frontiers in Neural Question Generation: A Literature Review

QIU Jia-zuo, XIONG De-yi   

  1. School of Computer Science and Technology,Soochow University,Suzhou,Jiangsu 215000,China
  • Received:2020-11-02 Revised:2021-03-15 Online:2021-06-15 Published:2021-06-03
  • About the authors: QIU Jia-zuo, born in 1996, postgraduate. His main research interests include natural language processing and question generation. (20184227050@stu.suda.edu.cn)
    XIONG De-yi, born in 1977, Ph.D., professor, Ph.D. supervisor. His main research interests include natural language processing and machine translation.
  • Supported by:
    Key R&D Program of the Ministry of Science and Technology:Special Project for Frontier Technology Innovation(2019QY1802).


Abstract: Question generation refers to a machine actively asking a natural language question about a given passage. Neural question generation is trained in a fully end-to-end fashion, using neural networks to convert documents and answers into questions; it is an emerging and important research direction in natural language processing. This paper first gives a brief introduction to neural question generation, covering basic concepts, mainstream frameworks, and evaluation methods. It then discusses the key issues in this research direction, including input modeling, long-document processing, multi-task learning, applications of machine learning methods, and other open problems and possible improvements. Finally, it describes the relationship between question generation and question answering, as well as future research directions for question generation.
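The input modeling mentioned in the abstract is commonly answer-aware: each passage token is tagged with its position relative to the answer span using a BIO scheme, as in the feature-rich encoder of Zhou et al. (2017). A minimal sketch of that preprocessing step, assuming token-level indices (the function name and signature are illustrative, not from any cited paper):

```python
def make_qg_input(passage_tokens, answer_start, answer_len):
    """Tag each passage token with a BIO answer-position feature, as
    commonly done in answer-aware neural question generation: B marks the
    first answer token, I the rest of the answer span, O everything else."""
    tagged = []
    for i, tok in enumerate(passage_tokens):
        if i == answer_start:
            tag = "B"
        elif answer_start < i < answer_start + answer_len:
            tag = "I"
        else:
            tag = "O"
        tagged.append((tok, tag))
    return tagged
```

The resulting tag sequence is typically embedded and concatenated to the word embeddings before being fed to the encoder, so the decoder knows which span the generated question should ask about.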

Key words: Encoder-decoder model, Machine reading comprehension, Neural question generation
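Among the evaluation methods the abstract refers to, BLEU (Papineni et al., 2002) is the most widely reported metric for question generation: a geometric mean of modified n-gram precisions against a reference question, scaled by a brevity penalty. A minimal single-reference, sentence-level sketch (actual evaluations usually use corpus-level BLEU with standard tokenization and smoothing):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU against a single reference: geometric mean of
    modified 1..max_n-gram precisions times a brevity penalty."""
    precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        overlap = sum(min(count, ref[g]) for g, count in cand.items())
        precisions.append(overlap / max(sum(cand.values()), 1))
    if min(precisions) == 0:
        return 0.0  # unsmoothed: any empty n-gram overlap zeroes the score
    brevity = (1.0 if len(candidate) > len(reference)
               else math.exp(1 - len(reference) / len(candidate)))
    return brevity * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

Because BLEU only rewards n-gram overlap with one reference, it correlates imperfectly with question quality; this motivates the answerability-aware metrics proposed for question generation, such as Q-BLEU.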

CLC Number: TP309