Computer Science ›› 2020, Vol. 47 ›› Issue (6): 74-78. doi: 10.11896/jsjkx.190600006

Special Issue: Big Data & Data Science

• Database & Big Data & Data Science •

Chinese Short Text Summarization Generation Model Based on Semantic-aware

NI Hai-qing, LIU Dan, SHI Meng-yu   

  1. Research Institute of Electronic Science and Technology,University of Electronic Science and Technology of China,Chengdu 611731,China
  • Received:2019-06-01 Online:2020-06-15 Published:2020-06-10
  • About author:NI Hai-qing,born in 1994,postgraduate.His main research interests include text summarization and so on.
    LIU Dan,born in 1969,Ph.D,associate professor.His main research interests include network and system security,cloud computing and data processing.

Abstract: Text summarization technology can distill key information from massive data and effectively alleviate the problem of information overload.At present,the sequence-to-sequence model is widely used for English abstractive summarization,but it has not been studied in depth for Chinese text summarization.In the conventional sequence-to-sequence model,the decoder applies the attention mechanism to the hidden state of each word output by the encoder as the overall semantic information;however,the encoder's hidden state for each word only takes into account the words immediately before and after it,so the generated summary tends to miss the core information of the source text.To solve this problem,a semantic-aware Chinese short text summarization model called SA-Seq2Seq is proposed,built on the sequence-to-sequence model with an attention mechanism.SA-Seq2Seq applies the pre-trained model BERT in the encoder to encode the source text,so that the representation of each word contains the overall semantic information,and uses the gold summary as the target semantic information in the decoder to compute a semantic inconsistency loss,thus ensuring the semantic integrity of the generated summary.Experiments are carried out on the Chinese short text summarization dataset LCSTS.The results show that SA-Seq2Seq improves significantly over the benchmark models on the ROUGE metric:its ROUGE-1,ROUGE-2 and ROUGE-L scores increase by 3.4%,7.1% and 6.1% respectively on the character-based version of the dataset,and by 2.7%,5.4% and 11.7% respectively on the word-based version.Therefore,the SA-Seq2Seq model can effectively process Chinese short text and ensure the fluency and consistency of the generated summary,and it is applicable to the Chinese short text summary generation task.
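The components named in the abstract (a BERT encoder providing context-aware token states, an attentional decoder, and a semantic inconsistency loss computed against the gold summary) can be sketched roughly as follows. This is a minimal, hypothetical PyTorch sketch rather than the authors' code: the class name SASeq2SeqSketch, the GRU decoder, the cosine-distance form of the semantic inconsistency term and its 0.5 weight are illustrative assumptions not given in the abstract.

import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import BertModel


class SASeq2SeqSketch(nn.Module):
    """Illustrative sketch of a semantic-aware seq2seq summarizer (assumed details)."""

    def __init__(self, vocab_size, hidden=768):
        super().__init__()
        # Semantic-aware encoder: BERT gives every token a representation that
        # reflects the whole source sentence, not only its neighbouring words.
        self.encoder = BertModel.from_pretrained("bert-base-chinese")
        self.embed = nn.Embedding(vocab_size, hidden, padding_idx=0)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.attn = nn.Linear(hidden, hidden)        # Luong-style "general" attention
        self.out = nn.Linear(hidden * 2, vocab_size)

    def forward(self, src_ids, src_mask, tgt_ids, tgt_mask):
        enc = self.encoder(input_ids=src_ids, attention_mask=src_mask).last_hidden_state
        # Initialise the decoder from the [CLS] state (an illustrative choice).
        h0 = enc[:, 0].unsqueeze(0).contiguous()
        dec_out, _ = self.decoder(self.embed(tgt_ids[:, :-1]), h0)   # teacher forcing

        # Attention over all encoder states.
        scores = torch.bmm(self.attn(dec_out), enc.transpose(1, 2))
        scores = scores.masked_fill(src_mask[:, None, :] == 0, -1e9)
        context = torch.bmm(F.softmax(scores, dim=-1), enc)
        logits = self.out(torch.cat([dec_out, context], dim=-1))

        # Standard next-token cross entropy against the gold summary.
        ce = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                             tgt_ids[:, 1:].reshape(-1), ignore_index=0)

        # Semantic inconsistency term (assumed form): cosine distance between the
        # mean decoder state and the gold summary's mean BERT representation.
        with torch.no_grad():
            gold = self.encoder(input_ids=tgt_ids,
                                attention_mask=tgt_mask).last_hidden_state.mean(dim=1)
        sem = 1.0 - F.cosine_similarity(dec_out.mean(dim=1), gold, dim=-1).mean()
        return ce + 0.5 * sem                        # 0.5 weight is an assumption

Under these assumptions, training would follow the usual teacher-forcing loop over LCSTS batches with an optimizer such as Adam; decoding at test time would replace teacher forcing with step-by-step generation.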

Key words: Attention mechanism, Chinese short text summarization, Pre-training model, Semantic aware, Sequence to sequence model

CLC Number: TP391.1
[1]NETO J L,FREITAS A A,KAESTNER C A A.Automatic Text Summarization Using a Machine Learning Approach[C]//16th Brazilian Symposium on Artificial Intelligence.2002:11-14.
[2]CHOPRA S,AULI M,RUSH A M.Abstractive Sentence Summarization with Attentive Recurrent Neural Networks[C]//Conference of the North American Chapter of the Association for Computational Linguistics.2016:93-98.
[3]SUTSKEVER I,VINYALS O,LE Q V.Sequence to Sequence Learning with Neural Networks[C]//Advances in Neural Information Processing Systems 27.2014:3104-3112.
[4]BAHDANAU D,CHO K,BENGIO Y.Neural Machine Translation by Jointly Learning to Align and Translate[C]//3rd International Conference on Learning Representations.2015.
[5]LUONG M T,PHAM H,MANNING C D.Effective Approaches to Attention-based Neural Machine Translation[C]// Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing.2015:1412-1421.
[6]PANG C,YIN C H.Chinese Text Summarization Based on Classification[J].Computer Science,2018,45(1):144-147,178.
[7]WU R S,WANG H L,WANG Z Q, et al.Short Text Summary Generation with Global Self-Matching Mechanism[J/OL].Journal of Software.https://doi.org/10.13328/j.cnki.jos.005850.
[8]MIKOLOV T,SUTSKEVER I,CHEN K,et al.Distributed Representations of Words and Phrases and their Compositionality[C]//27th Annual Conference on Neural Information Processing Systems 2013.2013:3111-3119.
[9]PETERS M E,NEUMANN M,IYYER M,et al.Deep Contextualized Word Representations[C]//Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics:Human Language Technologies.2018:2227-2237.
[10]RADFORD A,NARASIMHAN K,et al.Improving language understanding by generative pre-training[EB/OL].https://s3-us-west-2.amazonaws.com/openaiassets/research-covers/language-unsupervised/languageunderstandingpaper.pdf.
[11]RADFORD A,WU J,et al.Language Models are Unsupervised Multitask Learners[EB/OL].https://blog.openai.com/better-language-models/.
[12]DEVLIN J,CHANG M,LEE K,et al.BERT:Pre-training of Deep Bidirectional Transformers for Language Understanding[J].CoRR,2018,abs/1810.04805.
[13]KRYŚCIŃSKI W,PAULUS R,XIONG C,et al.Improving Abstraction in Text Summarization[C]//Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing.2018:1808-1817.
[14]HOCHREITER S,SCHMIDHUBER J.Long short-term memory[J].Neural Computation,1997,9(8):1735-1780.
[15]CHO K,VAN MERRIENBOER B,BAHDANAU D,et al.On the Properties of Neural Machine Translation:Encoder-Decoder Approaches[C]//Proceedings of SSST@EMNLP 2014.2014:103-111.
[16]VASWANI A,SHAZEER N,PARMAR N,et al.Attention is All you Need[C]//Advances in Neural Information Processing Systems.2017:6000-6010.
[17]WILLIAMS R J,ZIPSER D.A learning algorithm for continually running fully recurrent neural networks[J].Neural Computation,1989,1(2):270-280.
[18]HU B,CHEN Q,ZHU F,et al.LCSTS:A Large Scale Chinese Short Text Summarization Dataset[C]//Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing.2015:1967-1972.
[19]LIN C.ROUGE:A Package for Automatic Evaluation of Summaries[C]//Text Summarization Branches Out:Proceedings of the ACL-04 Workshop.2004:74-81.
[20]KINGMA D P,BA J.Adam:A method for stochastic optimization[C]//3rd International Conference on Learning Representations.2015.
[21]NETO J L,FREITAS A A,KAESTNER C A,et al.Automatic Text Summarization Using a Machine Learning Approach[C]//Brazilian Symposium on Artificial Intelligence.2002:205-215.