Computer Science ›› 2023, Vol. 50 ›› Issue (8): 150-156. doi: 10.11896/jsjkx.221100128

• Artificial Intelligence •


Text Paraphrase Generation Based on Pre-trained Language Model and Tag Guidance

LIANG Jiayin, XIE Zhipeng   

  1. School of Computer Science, Fudan University, Shanghai 200438, China
  • Received: 2022-11-15 Revised: 2023-03-21 Online: 2023-08-15 Published: 2023-08-02
  • About author: LIANG Jiayin (liangjy20@fudan.edu.cn), born in 1997, postgraduate. Her main research interests include natural language processing and paraphrase generation.
    XIE Zhipeng (corresponding author, xiezp@fudan.edu.cn), born in 1976, Ph.D, associate professor, Ph.D supervisor, is a member of China Computer Federation. His main research interests include data mining, machine learning and natural language processing.
  • Supported by:
    National Natural Science Foundation of China (62076072).


Abstract: Text paraphrase generation is an important and challenging task in NLP. Some recent works have used the syntactic structure information of sentences at different granularities to guide the paraphrase generation process and have achieved fair performance. However, such methods are rather complex and difficult to transfer. Besides, pre-trained language models have shown good performance in various NLP tasks thanks to the large amount of linguistic knowledge they learn, but they have rarely been used in the paraphrase generation task. To address these problems, this paper proposes a paraphrase generation method based on a pre-trained language model and tag guidance. The pre-trained language model is fine-tuned on the paraphrase task to improve performance, and a simple tag insertion scheme provides the generation model with syntactic structure guidance. Experimental results show that the proposed method outperforms traditional Seq2Seq methods on the ParaNMT and Quora datasets, and that using it for data augmentation improves the performance of downstream tasks.

Key words: Text paraphrase generation, Pre-trained language model, Data augmentation

CLC Number: TP391
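
The tag-guidance method summarized in the abstract admits a minimal sketch, assuming a BART backbone and the HuggingFace transformers library: coarse syntactic tags are registered as new vocabulary tokens and prepended to each source sentence, and the pre-trained model is fine-tuned on the tagged (source, paraphrase) pairs. The tag set (the <SYN_*> tokens) and the helper make_tagged_input are illustrative assumptions; the abstract does not specify the paper's exact tag scheme.

from transformers import BartTokenizer, BartForConditionalGeneration

# Load a pre-trained seq2seq language model to be fine-tuned on paraphrase pairs.
tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

# Hypothetical coarse syntactic tags (assumption: the paper's exact scheme is
# not given in the abstract); register them as new vocabulary tokens.
SYNTACTIC_TAGS = ["<SYN_S>", "<SYN_SBARQ>", "<SYN_NP>"]
tokenizer.add_tokens(SYNTACTIC_TAGS)
model.resize_token_embeddings(len(tokenizer))

def make_tagged_input(source: str, tag: str) -> str:
    # "Tag insertion": prepend the syntactic guidance tag to the source sentence.
    return f"{tag} {source}"

# One fine-tuning step on a single (source, paraphrase) pair.
src = make_tagged_input("how can i improve my english ?", "<SYN_SBARQ>")
tgt = "what should i do to get better at english ?"
batch = tokenizer(src, return_tensors="pt")
labels = tokenizer(tgt, return_tensors="pt").input_ids
loss = model(**batch, labels=labels).loss
loss.backward()  # an optimizer step would follow in a full training loop

# Inference: the chosen tag steers the syntactic form of the paraphrase.
inputs = tokenizer(make_tagged_input("how can i improve my english ?", "<SYN_S>"),
                   return_tensors="pt")
output_ids = model.generate(**inputs, num_beams=4, max_length=40)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))

For the data-augmentation use reported in the abstract, paraphrases generated this way for a downstream task's training sentences would simply be added to that task's training set under the original labels.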