Generation of Contributions of Scientific Paper Based on Multi-step Sentence Selecting-and-Rewriting Model

doi:10.11896/jsjkx.230800080

Abstract

Abstract: There has been a significant surge in the number of scientific papers published in recent years, which makes it challenging for researchers to keep up with the latest advancements in their fields. To stay updated, researchers often rely on reading the contributions section of papers, which serves as a concise summary of the key research findings. However, it is not uncommon for authors to inadequately present the innovative content of their articles, making it difficult for readers to quickly grasp the essence of the research. To address this issue, we propose a novel task of contribution summarization to automatically generate contribution summaries of scientific papers. One of the challenges of this task is the lack of relevant datasets. Therefore, we construct a scientific contribution summarization corpus (SCSC). Another issue lies in the fact that currently available abstractive or extractive models tend to suffer from either excessive redundancy or a lack of coherence between sentences. To meet the demand of generating concise and high-quality contribution sentences, we present MSSRsum, a multi-step sentence selecting-and-rewriting model. Experiments show that the proposed model outperforms baselines on SCSC and arXiv datasets.

Key words: Summarization, Scientific papers, Multi-step sentence selecting-and-rewriting, Generation of contributions

CLC Number:

TP391

XU Xianzhe, CHEN Jingqiang. Generation of Contributions of Scientific Paper Based on Multi-step Sentence Selecting-and-Rewriting Model [J]. Computer Science, 2024, 51(10): 344-350.
