Computer Science ›› 2023, Vol. 50 ›› Issue (8): 150-156. doi: 10.11896/jsjkx.221100128

• Artificial Intelligence •

Text Paraphrase Generation Based on Pre-trained Language Model and Tag Guidance

LIANG Jiayin, XIE Zhipeng   

  1. School of Computer Science, Fudan University, Shanghai 200438, China
  • Received: 2022-11-15 Revised: 2023-03-21 Online: 2023-08-15 Published: 2023-08-02
  • About author: LIANG Jiayin, born in 1997, postgraduate. Her main research interests include natural language processing and paraphrase generation.
    XIE Zhipeng, born in 1976, Ph.D., associate professor, Ph.D. supervisor, is a member of the China Computer Federation. His main research interests include data mining, machine learning and natural language processing.
  • Supported by:
    National Natural Science Foundation of China (62076072).

Abstract: Text paraphrase generation is an important and challenging task in NLP. Some recent works have used syntactic structure information at different levels of granularity to guide paraphrase generation and have achieved fair performance. However, such methods are rather complex and difficult to transfer. Pre-trained language models, meanwhile, have shown good performance on a variety of NLP tasks thanks to the knowledge they have learned, but they have rarely been applied to paraphrase generation. This paper proposes a paraphrase generation method based on a pre-trained language model and tag guidance. The pre-trained language model is fine-tuned to improve paraphrase generation performance, and a simple tag insertion method provides syntactic structure guidance. Experimental results show that the proposed method outperforms traditional Seq2Seq methods on the ParaNMT and Quora datasets. The method is also shown to improve downstream tasks when used for data augmentation.
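
The abstract does not specify the tag format, so the following is only a minimal sketch of what "tag insertion" could look like in general: syntactic labels are written into the input text so that a fine-tuned sequence-to-sequence model can condition on them. The function name `insert_tags`, the `<sep>` separator, the angle-bracket tag format and the example labels are all hypothetical and are not taken from the paper.

```python
from typing import List

SEP = "<sep>"  # hypothetical separator token (assumption, not from the paper)


def insert_tags(source: str, tags: List[str], sep: str = SEP) -> str:
    """Prefix a source sentence with syntactic tags in angle brackets.

    This is an illustrative guess at a tag-insertion scheme, not the
    authors' actual input format.
    """
    if not tags:
        # No guidance available: pass the sentence through unchanged.
        return source
    prefix = " ".join(f"<{tag}>" for tag in tags)
    return f"{prefix} {sep} {source}"


# Hypothetical constituency labels describing the desired output structure.
guided_input = insert_tags("how can i learn python quickly ?", ["SBARQ", "WHADVP", "SQ"])
print(guided_input)
```

Under these assumptions, the resulting string would be tokenized and passed to the fine-tuned model in place of the raw sentence; the model itself needs no architectural change, which is presumably what makes such an approach simple to transfer.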

Key words: Text paraphrase generation, Pre-trained language model, Data augmentation

CLC Number: 

  • TP391