Computer Science ›› 2024, Vol. 51 ›› Issue (6A): 230300160-8. doi: 10.11896/jsjkx.230300160

• Artificial Intelligence •

Study on Pre-training Tasks for Multi-document Summarization

DING Yi, WANG Zhongqing   

  1. School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China
  • Published: 2024-06-06
  • About author: DING Yi, born in 2002, undergraduate. His main research interests include natural language processing and multi-document summarization.
    WANG Zhongqing, born in 1987, Ph.D., associate professor. His main research interests include natural language processing, information extraction and sentiment analysis.

Abstract: News summarization aims to quickly and accurately extract a concise summary from complex news text. This paper studies multi-document summarization based on pre-trained language models, focusing on how training methods that combine pre-training tasks improve model performance and strengthen information exchange among multiple documents, so as to generate more comprehensive and concise summaries. For the combined pre-training tasks, comparative experiments are conducted on the baseline model and on the content, number, and order of the pre-training tasks; effective pre-training tasks are identified, specific methods for strengthening information exchange between documents are summarized, and a concise and efficient pre-training procedure is distilled and proposed. Through training and testing on a public multi-document news dataset, the experiments show that the content, number, and order of the pre-training tasks each bring a certain improvement in ROUGE scores, and that the pre-training combination obtained by integrating these three conclusions yields a significant increase in ROUGE.
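The paper's exact pipeline is not reproduced on this page, but the core idea of applying several pre-training tasks in a fixed order to a single seq2seq model before fine-tuning it for multi-document summarization can be sketched as follows. This is a minimal sketch assuming a HuggingFace seq2seq baseline; the checkpoint name, task datasets, and hyperparameters are illustrative placeholders, not the authors' settings.

```python
# Minimal sketch: sequentially train one seq2seq model on several
# pre-training tasks, then fine-tune for multi-document summarization.
# Checkpoint, datasets, and hyperparameters are assumptions, not the paper's.
from transformers import (AutoTokenizer, AutoModelForSeq2SeqLM,
                          Seq2SeqTrainer, Seq2SeqTrainingArguments)

MODEL_NAME = "facebook/bart-base"  # assumed baseline checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)


def run_stage(model, dataset, output_dir):
    """Train the same model on one task so later stages inherit its weights."""
    args = Seq2SeqTrainingArguments(
        output_dir=output_dir,
        per_device_train_batch_size=4,
        num_train_epochs=1,
    )
    trainer = Seq2SeqTrainer(model=model, args=args, train_dataset=dataset,
                             tokenizer=tokenizer)
    trainer.train()
    return trainer.model


# Hypothetical pre-training tasks applied in a chosen order, followed by
# fine-tuning on a multi-document news corpus such as Multi-News; each
# dataset must yield tokenized input_ids/labels pairs for its task.
# stages = [("gap_sentence_generation", gsg_dataset),
#           ("cross_document_masking", mask_dataset)]
# for name, dataset in stages:
#     model = run_stage(model, dataset, f"./pretrain_{name}")
# model = run_stage(model, multinews_train, "./finetune_multinews")
```

Because every stage reuses the weights produced by the previous one, changing the content, number, or order of the stages changes the final summarizer, which is the variable the comparative experiments probe.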

Key words: News, Summarization, Pre-training, Multi-document, Information exchange
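ROUGE, the evaluation metric referenced in the abstract, is commonly computed with the `rouge-score` package; the paper does not specify its evaluation tooling, so the snippet below is only a sketch with placeholder texts.

```python
# Minimal ROUGE evaluation sketch using the rouge-score package
# (pip install rouge-score); the example texts are placeholders.
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"],
                                  use_stemmer=True)

reference = "the storm forced thousands of residents to evacuate the coast"
generated = "thousands of coastal residents were forced to evacuate"

# score(target, prediction) returns precision/recall/F1 for each ROUGE variant
scores = scorer.score(reference, generated)
for name, s in scores.items():
    print(f"{name}: P={s.precision:.3f} R={s.recall:.3f} F1={s.fmeasure:.3f}")
```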

CLC Number: TP391