Computer Science ›› 2018, Vol. 45 ›› Issue (1): 144-147. doi: 10.11896/j.issn.1002-137X.2018.01.024

• The 16th China Conference on Machine Learning •

Chinese Text Summarization Based on Classification

PANG Chao, YIN Chuan-huan

  School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
  • Online: 2018-01-15    Published: 2018-11-13

Abstract: Automatic text summarization is an important research topic in natural language processing. According to how the summary is produced, it can be divided into extractive summarization and abstractive summarization. Abstractive summarization re-expresses the central content and concepts of the original document in a different form, so the words in the generated summary need not be drawn from the original document. This paper proposes an abstractive summarization model with a classifier. The model combines an encoder-decoder structure based on recurrent neural networks with a classification structure, making fuller use of the supervised information to obtain more summary-oriented features. An attention mechanism in the encoder-decoder structure lets the model capture the central content of the source text more accurately. The two parts of the model can be trained end-to-end on large datasets at the same time, and the training procedure is simple and effective. The proposed model achieves good performance on both text summarization and text classification.

Key words: Recurrent neural networks, Attention mechanism, Text summarization, Text classification
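To make the described architecture concrete, the following is a minimal PyTorch sketch of a recurrent encoder-decoder with attention whose encoder state also feeds a classification head, so both objectives share the encoder. This is an illustration under assumptions, not the authors' implementation: all module names, dimensions, and the Luong-style attention scoring are hypothetical choices.

```python
# Minimal sketch of the architecture described in the abstract: a GRU
# encoder-decoder with attention for summary generation, plus a classification
# head over the final encoder state. Names and hyperparameters are illustrative
# assumptions, not taken from the paper.
import torch
import torch.nn as nn

class SummarizerWithClassifier(nn.Module):
    def __init__(self, vocab_size, num_classes, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim + hid_dim, hid_dim, batch_first=True)
        self.attn = nn.Linear(hid_dim, hid_dim)            # Luong-style "general" score
        self.out = nn.Linear(hid_dim * 2, vocab_size)      # next-word distribution
        self.classifier = nn.Linear(hid_dim, num_classes)  # shared-encoder class head

    def forward(self, src, tgt):
        enc_out, enc_h = self.encoder(self.embed(src))     # (B, S, H), (1, B, H)
        # Classification branch: predict the document class from the final state.
        class_logits = self.classifier(enc_h[-1])          # (B, num_classes)
        # Decoding branch: attend over encoder outputs at every step.
        dec_h, logits = enc_h, []
        for t in range(tgt.size(1)):                       # teacher forcing
            emb = self.embed(tgt[:, t:t + 1])              # (B, 1, E)
            # Attention scores: current decoder state vs. every source position.
            score = torch.bmm(self.attn(enc_out), dec_h[-1].unsqueeze(2))        # (B, S, 1)
            ctx = (enc_out * torch.softmax(score, dim=1)).sum(1, keepdim=True)   # (B, 1, H)
            dec_out, dec_h = self.decoder(torch.cat([emb, ctx], dim=2), dec_h)
            logits.append(self.out(torch.cat([dec_out, ctx], dim=2)))
        return torch.cat(logits, dim=1), class_logits      # (B, T, V), (B, num_classes)

# Usage with toy shapes (hypothetical sizes):
model = SummarizerWithClassifier(vocab_size=4000, num_classes=10)
src = torch.randint(0, 4000, (2, 30))        # batch of 2 source texts, length 30
tgt = torch.randint(0, 4000, (2, 8))         # teacher-forced summary prefixes
word_logits, class_logits = model(src, tgt)  # (2, 8, 4000), (2, 10)
```

Joint training in the spirit of the abstract would minimize the sum of a token-level cross-entropy over `word_logits` against the reference summary and a cross-entropy over `class_logits` against the document label, so the supervised classification signal also shapes the shared encoder.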

