Computer Science ›› 2022, Vol. 49 ›› Issue (6): 313-318. DOI: 10.11896/jsjkx.210400101

• Artificial Intelligence •

Automatic Summarization Model Combining BERT Word Embedding Representation and Topic Information Enhancement

GUO Yu-xin¹, CHEN Xiu-hong²

  1. School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu 214122, China
  2. Jiangsu Key Laboratory of Media Design and Software Technology, Wuxi, Jiangsu 214122, China
  • Received: 2021-04-10  Revised: 2021-07-25  Online: 2022-06-15  Published: 2022-06-08
  • About author: GUO Yu-xin, born in 1997, postgraduate. Her main research interests include natural language processing and text summarization.
    CHEN Xiu-hong, born in 1964, Ph.D. supervisor. His main research interests include pattern recognition and intelligent computing.
  • Supported by:
    Jiangsu Postgraduate Research and Practice Innovation Program (JNKY19_074).

Abstract: Automatic text summarization helps people filter and identify information quickly, grasp the key content of news, and alleviate information overload. Mainstream abstractive summarization models are based on the encoder-decoder architecture. Given that the decoder does not fully consider text topic information when predicting the target word, and that traditional static Word2Vec word vectors cannot resolve polysemy, an automatic summarization model for Chinese short news is proposed that integrates BERT word embedding representation with topic information enhancement. The encoder uses an unsupervised algorithm to obtain text topic information and integrates it into the attention mechanism, improving the model's decoding. On the decoder side, sentence vectors extracted from the pre-trained BERT language model serve as supplementary features, providing richer semantic information. Meanwhile, a pointer mechanism is introduced to handle out-of-vocabulary words, and a coverage mechanism effectively suppresses repetition. Finally, during training, a reinforcement learning method optimizes the model directly for the non-differentiable ROUGE metric, mitigating exposure bias. Experimental results on two Chinese short news summarization datasets show that the proposed model significantly improves ROUGE scores, effectively integrates text topic information, and generates fluent, concise summaries.
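The two sketches below are illustrative only and do not come from the paper; class names, tensor dimensions, and the exact form in which the topic vector enters the attention score are assumptions. The first shows one plausible way to fold a text-level topic vector into additive (Bahdanau-style) attention, in the spirit of the encoder-side topic enhancement described above.

```python
import torch
import torch.nn as nn

class TopicAwareAttention(nn.Module):
    """Additive attention with an extra topic term (hypothetical sketch).

    The paper integrates topic information into the attention mechanism;
    the exact formulation is not given in the abstract, so this sketch
    simply adds a learned projection of a fixed topic vector (e.g. an
    average of topic-word embeddings) to the alignment score.
    """
    def __init__(self, hidden_dim: int, topic_dim: int):
        super().__init__()
        self.W_h = nn.Linear(hidden_dim, hidden_dim, bias=False)  # encoder states
        self.W_s = nn.Linear(hidden_dim, hidden_dim, bias=False)  # decoder state
        self.W_t = nn.Linear(topic_dim, hidden_dim, bias=False)   # topic vector
        self.v = nn.Linear(hidden_dim, 1, bias=False)

    def forward(self, enc_states, dec_state, topic_vec):
        # enc_states: (batch, src_len, hidden); dec_state: (batch, hidden);
        # topic_vec: (batch, topic_dim)
        score = self.v(torch.tanh(
            self.W_h(enc_states)
            + self.W_s(dec_state).unsqueeze(1)
            + self.W_t(topic_vec).unsqueeze(1)
        )).squeeze(-1)                                    # (batch, src_len)
        attn = torch.softmax(score, dim=-1)               # attention weights
        context = torch.bmm(attn.unsqueeze(1), enc_states).squeeze(1)
        return context, attn
```

The second sketch shows the standard self-critical policy-gradient loss (Rennie et al., 2017) that the training stage alludes to: the ROUGE score of a greedily decoded summary serves as the baseline for a sampled summary, so the model is rewarded only when sampling beats greedy decoding.

```python
import torch

def self_critical_loss(sample_log_probs, sample_rouge, greedy_rouge):
    """Self-critical sequence-training loss (sketch).

    sample_log_probs: (batch,) summed log-probabilities of the sampled summary
    sample_rouge:     (batch,) ROUGE reward of the sampled summary
    greedy_rouge:     (batch,) ROUGE reward of the greedy baseline summary
    """
    advantage = (sample_rouge - greedy_rouge).detach()   # baseline-subtracted reward
    return -(advantage * sample_log_probs).mean()        # minimize negative expected reward
```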

Key words: Abstractive summarization, Attention mechanism, BERT, Reinforcement learning, Topic information

CLC Number: TP391.1