Computer Science ›› 2024, Vol. 51 ›› Issue (6A): 230300160-8. doi: 10.11896/jsjkx.230300160

• Artificial Intelligence •

Research on Multi-document Summarization Combined with Pre-training

DING Yi, WANG Zhongqing

  1. School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China
  • Published: 2024-06-06
  • Corresponding author: WANG Zhongqing (wangzq@suda.edu.cn)
  • About author: (1959001912@qq.com)

Study on Pre-training Tasks for Multi-document Summarization

DING Yi, WANG Zhongqing   

  1. School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China
  • Published: 2024-06-06
  • About author: DING Yi, born in 2002, undergraduate. His main research interests include natural language processing and multi-document summarization.
    WANG Zhongqing, born in 1987, Ph.D, associate professor. His main research interests include natural language processing, information extraction and sentiment analysis.

Abstract: The news summarization task aims to quickly and accurately distill a concise summary from large and complex news texts. This work studies multi-document summarization based on pre-trained language models, focusing on how specific training schemes that incorporate pre-training tasks improve model performance and strengthen information exchange among multiple documents, so as to generate more comprehensive and more concise summaries. For the combined pre-training tasks, comparative experiments are designed on the baseline model and on the content, number, and order of pre-training tasks; effective pre-training tasks are identified, concrete methods for strengthening cross-document information exchange are summarized, and a concise and efficient pre-training procedure is distilled. Training and testing on a public multi-document news dataset show that the content, number, and order of the pre-training tasks each improve the ROUGE score to some extent, and that the specific pre-training combination derived by integrating these three findings improves the ROUGE score markedly.

Keywords: News, Summarization, Pre-training, Multi-document, Information exchange

Abstract: News summarization aims to quickly and accurately extract a concise summary from complex news texts. This paper studies multi-document summarization based on pre-trained language models, focusing on how model training methods combined with pre-training tasks improve model performance and strengthen information exchange between multiple documents, in order to generate more comprehensive and concise summaries. For the combined pre-training tasks, this paper conducts comparative experiments on the baseline model and on the content, quantity, and order of pre-training tasks, identifies effective pre-training tasks, summarizes specific methods for strengthening information exchange between documents, and distills a concise and efficient pre-training process. Through training and testing on a public multi-document news dataset, experimental results show that the content, quantity, and order of the pre-training tasks each improve the ROUGE score to a certain extent, and that the specific pre-training combination obtained by integrating these three conclusions improves the ROUGE score significantly.

Key words: News, Summarization, Pre-training, Multi-document, Information exchange
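
To make the training scheme described in the abstract concrete, the following is a minimal sketch of a sequential pre-training pipeline: several pre-training tasks are run in a chosen order on a seq2seq backbone before fine-tuning on multi-document summarization. It uses the Hugging Face transformers library with BART purely for illustration; the task names, the build_examples() helper, and the toy data are assumptions of this sketch, not the authors' actual tasks or code.

    # Hypothetical sketch: run pre-training tasks in a chosen order, then fine-tune
    # on multi-document summarization. Task names and data are illustrative only.
    import torch
    from transformers import BartForConditionalGeneration, BartTokenizerFast

    tokenizer = BartTokenizerFast.from_pretrained("facebook/bart-base")
    model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")
    optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)

    def train_on_pairs(pairs, epochs=1):
        # One seq2seq training pass over (source, target) text pairs.
        model.train()
        for _ in range(epochs):
            for src, tgt in pairs:
                batch = tokenizer(src, truncation=True, max_length=1024, return_tensors="pt")
                labels = tokenizer(tgt, truncation=True, max_length=256, return_tensors="pt").input_ids
                loss = model(**batch, labels=labels).loss
                loss.backward()
                optimizer.step()
                optimizer.zero_grad()

    def build_examples(task):
        # Stub data builder: each task turns concatenated news documents into
        # (source, target) pairs meant to encourage cross-document information exchange.
        if task == "gap_sentence_prediction":  # recover a masked sentence using the other documents
            return [("Doc1: A storm hit the coast. <mask> Doc2: Thousands were evacuated.",
                     "Officials issued warnings overnight.")]
        return [("Doc2: Thousands were evacuated. Doc1: A storm hit the coast.",  # document reordering
                 "Doc1: A storm hit the coast. Doc2: Thousands were evacuated.")]

    # The content, number, and order of this list are the variables being compared.
    pretraining_tasks = ["gap_sentence_prediction", "document_permutation"]
    for task in pretraining_tasks:
        train_on_pairs(build_examples(task))

    # Finally, fine-tune on real multi-document summarization pairs (e.g., Multi-News).
    train_on_pairs([("Doc1: ... Doc2: ...", "A human-written summary.")])

Changing the elements of pretraining_tasks, their number, or their order reproduces the kind of content, quantity, and order comparison the abstract refers to.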

CLC Number:

  • TP391
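
Since the experiments are reported in terms of ROUGE, the short sketch below shows a common way to compute ROUGE-1/2/L scores with the open-source rouge-score package (pip install rouge-score); the example strings are invented, and this is a generic illustration rather than the authors' evaluation toolkit.

    # Generic ROUGE evaluation sketch; the reference/candidate strings are made up.
    from rouge_score import rouge_scorer

    scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)

    reference = "thousands of residents were evacuated as the storm approached the coast"
    candidate = "the storm forced thousands of residents to evacuate the coast"

    scores = scorer.score(reference, candidate)  # dict: metric -> Score(precision, recall, fmeasure)
    for name, s in scores.items():
        print(f"{name}: P={s.precision:.4f} R={s.recall:.4f} F1={s.fmeasure:.4f}")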