Computer Science, 2016, Vol. 43, Issue (6): 240-247. doi: 10.11896/j.issn.1002-137X.2016.06.048

Improved TextRank-based Method for Automatic Summarization

YU Shan-shan, SU Jin-dian and LI Peng-fei   

  • Online:2018-12-01 Published:2018-12-01

Abstract: Canonical TextRank usually considers only the similarity between sentences during automatic summarization and neglects text structure and sentence context. To overcome these shortcomings, we propose iTextRank, an improved method based on TextRank that incorporates the structural information of Chinese texts. When building the TextRank graph, computing sentence similarities, and adjusting node weights, iTextRank takes important contextual and semantic information into account, including titles, paragraphs, special sentences, and the positions and lengths of sentences. We also apply iTextRank to the automatic summarization of Chinese texts and analyze its time complexity. Experimental results show that iTextRank achieves a higher accuracy rate and a lower recall rate than canonical TextRank.

Key words: Chinese texts, Automatic summarization extraction, TextRank, Article discourse, Unsupervised learning methods
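
The pipeline outlined in the abstract (build a sentence graph, compute sentence similarities, adjust node weights using structural cues) can be illustrated with a minimal sketch. The similarity function and the ranking iteration follow the canonical TextRank formulation. The `structure_factor` function, its `title_boost` and `position_boost` parameters, and the choice to multiply scores after ranking are assumptions made purely for illustration; the abstract does not give iTextRank's actual formulas, so this is not the authors' method. Sentences are assumed to be already segmented into word lists, which for Chinese text requires a separate word segmenter.

```python
import math


def similarity(a, b):
    """Word-overlap similarity between two tokenized sentences (canonical TextRank)."""
    if not a or not b:
        return 0.0
    denom = math.log(len(a)) + math.log(len(b))
    if denom <= 0:  # both sentences contain a single token
        return 0.0
    return len(set(a) & set(b)) / denom


def textrank(sentences, d=0.85, iterations=50):
    """Canonical weighted TextRank scores for a list of tokenized sentences."""
    n = len(sentences)
    w = [[0.0 if i == j else similarity(sentences[i], sentences[j])
          for j in range(n)] for i in range(n)]
    out = [sum(row) for row in w]  # total edge weight leaving each node
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - d) + d * sum(w[j][i] * scores[j] / out[j]
                              for j in range(n) if out[j] > 0)
            for i in range(n)
        ]
    return scores


def structure_factor(index, n, sentence, title,
                     title_boost=0.5, position_boost=0.3):
    """Hypothetical multiplier: favors title-like sentences and the first/last sentence."""
    factor = 1.0 + title_boost * similarity(sentence, title)
    if index == 0 or index == n - 1:
        factor += position_boost
    return factor


def summarize(sentences, title, k=2):
    """Return indices of the k highest adjusted-score sentences, in document order."""
    n = len(sentences)
    base = textrank(sentences)
    adjusted = [base[i] * structure_factor(i, n, sentences[i], title)
                for i in range(n)]
    top = sorted(range(n), key=lambda i: adjusted[i], reverse=True)[:k]
    return sorted(top)
```

Calling `summarize(sentences, title, k=2)` returns sentence indices rather than text, so the caller can rejoin the original, unsegmented sentences in reading order. Other cues named in the abstract, such as paragraph boundaries, special sentences, and sentence length, could be folded into `structure_factor` in the same way, though how iTextRank weights them is not stated here.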
