Computer Science, 2020, Vol. 47, Issue (6): 74-78. doi: 10.11896/jsjkx.190600006

• Database & Big Data & Data Science •

Chinese Short Text Summarization Generation Model Based on Semantic-aware

NI Hai-qing, LIU Dan, SHI Meng-yu   

  1. Research Institute of Electronic Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
  • Received: 2019-06-01 Online: 2020-06-15 Published: 2020-06-10
  • About author: NI Hai-qing, born in 1994, postgraduate. His main research interests include text summarization, among others.
    LIU Dan, born in 1969, Ph.D, associate professor. His main research interests include network and system security, cloud computing and data processing.

Abstract: Text summarization technology can extract the key information from massive data and effectively alleviate the problem of information overload. At present, the sequence-to-sequence model is widely used for English text summarization, but it has not been studied in depth for Chinese text summarization. In the conventional sequence-to-sequence model, the decoder uses the hidden state that the encoder outputs for each word as the overall semantic information through the attention mechanism. However, each of these hidden states only takes into account the words before and after the current word, so the generated summary can miss the core information of the source text. To solve this problem, a semantic-aware Chinese short text summarization model called SA-Seq2Seq is proposed, built on the sequence-to-sequence model with attention mechanism. In the encoder, SA-Seq2Seq uses the pre-trained model BERT to represent the source text so that each word contains the overall semantic information; in the decoder, it uses the gold summary as the target semantic information to calculate a semantic inconsistency loss, thus ensuring the semantic integrity of the generated summary. Experiments are carried out on the Chinese short text summarization dataset LCSTS. The results show that SA-Seq2Seq improves significantly over the benchmark model on the ROUGE evaluation metric: its ROUGE-1, ROUGE-2 and ROUGE-L scores increase by 3.4%, 7.1% and 6.1% respectively on the character-based version of the dataset, and by 2.7%, 5.4% and 11.7% respectively on the word-based version. SA-Seq2Seq can therefore effectively condense Chinese short text while ensuring the fluency and consistency of the generated summary, and can be applied to the Chinese short text summary generation task.
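
To make the reported metrics concrete, the following is a minimal Python sketch of how ROUGE-N and ROUGE-L F-scores can be computed over a list of tokens. It is a simplified illustration, not the official ROUGE package and not the authors' evaluation script; the function names (`ngrams`, `rouge_n`, `lcs_length`, `rouge_l`) and the toy strings are assumptions introduced here. The character-based and word-based settings differ only in how the text is split into tokens before scoring.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n):
    """Simplified ROUGE-N F1 from clipped n-gram overlap (illustrative only)."""
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    overlap = sum((cand & ref).values())  # each n-gram counted at most min(cand, ref) times
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """Simplified ROUGE-L F1 based on the longest common subsequence."""
    lcs = lcs_length(candidate, reference)
    if lcs == 0:
        return 0.0
    precision = lcs / len(candidate)
    recall = lcs / len(reference)
    return 2 * precision * recall / (precision + recall)


# Character-based processing: every character is one token.
reference = list("abcd")   # hypothetical gold summary
candidate = list("abd")    # hypothetical generated summary
print(rouge_n(candidate, reference, 1))
print(rouge_n(candidate, reference, 2))
print(rouge_l(candidate, reference))
```

For word-based processing, the same functions would be called on lists of segmented words instead of `list(text)`; the choice of word segmenter is outside this sketch.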

Key words: Chinese short text summarization, Sequence to sequence model, Attention mechanism, Pre-training model, Semantic aware

CLC Number: TP391.1