计算机科学 (Computer Science), 2023, Vol. 50, Issue (10): 176-183. doi: 10.11896/jsjkx.220900201
ZHAO Yanbin, SU Jindian
Abstract: A key task of open-domain dialogue systems is to generate rich, diverse, and coherent responses, a goal that unidirectional inference from the preceding context alone cannot achieve. To address this problem, this paper proposes MLVBI (Multiple Latent Variables Bidirectional Inference), a bidirectional inference model based on multiple latent variables. First, a variational autoencoder is combined with the language model and unidirectional inference is extended to bidirectional inference: after the corpus is split into context, query, and response, forward inference infers the response from the query to learn normal word-order information, while backward inference infers the query from the response to learn additional topic information; the two are then fused into bidirectional inference, enabling the model to generate more coherent responses. Second, to address the limited explanatory power of a single latent variable during bidirectional inference, multiple latent variables are introduced to further improve the diversity of the generated dialogue. Experimental results show that MLVBI achieves state-of-the-art accuracy and diversity on two open-domain datasets, DailyDialog and PersonaChat, and ablation studies confirm the effectiveness of both bidirectional inference and multiple latent variables.
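The training objective described in the abstract can be viewed as combining a forward ELBO (response given query) and a backward ELBO (query given response), each regularized by KL terms for several latent variables. The following is a minimal numeric sketch of that combination; the NLL values, latent dimension, and function names are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def kl_standard_normal(mu, logvar):
    # KL( N(mu, sigma^2) || N(0, I) ), summed over latent dimensions
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar)

def elbo(recon_nll, mus, logvars):
    # ELBO with multiple latent variables: reconstruction term
    # minus the sum of per-variable KL terms
    return -recon_nll - sum(kl_standard_normal(m, lv) for m, lv in zip(mus, logvars))

# Hypothetical per-direction reconstruction NLLs (stand-ins for the
# decoder negative log-likelihoods of response-given-query and
# query-given-response)
forward_nll, backward_nll = 12.3, 15.7

# Two latent variables, each of illustrative dimension 4
mus = [rng.normal(size=4) for _ in range(2)]
logvars = [rng.normal(scale=0.1, size=4) for _ in range(2)]

# Bidirectional training loss: negated sum of forward and backward ELBOs
loss = -(elbo(forward_nll, mus, logvars) + elbo(backward_nll, mus, logvars))
print(round(loss, 3))
```

Because each KL term is non-negative, the combined loss is bounded below by the sum of the two reconstruction NLLs; balancing the KL weight against reconstruction is the usual VAE trade-off between latent-variable usage and fluency.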