Computer Science (计算机科学), 2023, Vol. 50, Issue 4: 188-195. doi: 10.11896/jsjkx.220200061
张翔, 毛兴静, 赵容梅, 琚生根
ZHANG Xiang, MAO Xingjing, ZHAO Rongmei, JU Shenggen
Abstract: Extractive automatic text summarization aims to compose a summary by selecting the sentences from the source document that best represent its overall meaning; it is widely applied and studied because it is simple and efficient. Most current extractive models score sentence importance using only local inter-sentence relations and select sentences accordingly. This ignores the document's global semantic information and leaves the model vulnerable to unimportant local relations. We therefore propose an extractive summarization model that incorporates global semantic information. After obtaining sentence and document representations, the model learns inter-sentence relations and global information through a sentence-level encoder and a global-information extraction module, fuses the extracted global information into the sentence vectors, and finally produces a score for each sentence that decides whether it belongs to the summary. The model can be trained end to end, and the global-information extraction module employs two extraction techniques: aspect-based extraction and a neural topic model. Experimental results on the public CNN/DailyMail dataset confirm the effectiveness of incorporating global information.
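The fusion-and-scoring step described above can be illustrated with a minimal NumPy sketch. This is an assumption-laden toy, not the paper's actual architecture: the gating formula, the function name `fuse_and_score`, and all dimensions are invented here for illustration only.

```python
import numpy as np

def fuse_and_score(sent_vecs, global_vec, W_s, W_g, w_out):
    """Fuse a document-level global vector into each sentence vector,
    then produce a per-sentence summary-membership score in (0, 1).

    A hypothetical gated fusion: the gate decides, per dimension, how much
    of the local sentence representation vs. the global representation to keep.
    """
    # Per-sentence gate from local and global projections (sigmoid), shape (n, d).
    gate = 1.0 / (1.0 + np.exp(-(sent_vecs @ W_s + global_vec @ W_g)))
    # Convex combination of local sentence vector and broadcast global vector.
    fused = gate * sent_vecs + (1.0 - gate) * global_vec
    # Linear scoring head with sigmoid, shape (n,).
    return 1.0 / (1.0 + np.exp(-(fused @ w_out)))

rng = np.random.default_rng(0)
n, d = 5, 8
S = rng.normal(size=(n, d))      # sentence representations (e.g., from an encoder)
g = rng.normal(size=(d,))        # global vector (e.g., topic/aspect representation)
W_s = rng.normal(size=(d, d))
W_g = rng.normal(size=(d, d))
w_out = rng.normal(size=(d,))

scores = fuse_and_score(S, g, W_s, W_g, w_out)
summary_idx = np.argsort(-scores)[:2]  # select the top-2 sentences as the summary
```

In a trained model the gate and scoring weights would be learned jointly with the encoder, which is what allows the end-to-end training the abstract mentions.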