Computer Science ›› 2016, Vol. 43 ›› Issue (9): 261-265. doi: 10.11896/j.issn.1002-137X.2016.09.052

• Artificial Intelligence •

Extract Summarization Method Based on Lex-PageRank in Chinese Microblog

ZHU Ming-feng, YE Shi-ren and YE Ren-ming

  1. School of Information Engineering, Changzhou University, Changzhou 213164
  • Online: 2018-12-01 Published: 2018-12-01
  • Funding:
    This work was supported by the National Natural Science Foundation of China (61272367).

Abstract: In recent years, the rise of personal media has triggered serious public-opinion crises, and efficiently discovering hot topics and extracting useful information from massive, fragmented microblog data has become a major challenge. Against this background, we propose an extractive summarization method based on an improved Lex-PageRank algorithm. In this scheme, we perform a simple clustering and use the clustering results as experimental data. Taking into account the time characteristics of the microblog influence cycle and the weight attributes of posts, the improved Lex-PageRank algorithm is combined with the MMR algorithm to select a number of texts from the clustering results and organize them into a summary. Comparative experiments on Sina Weibo data indicate that the method can effectively extract key information from large volumes of text.

Key words: Microblog, Time characteristics, Weight attribute, Lex-PageRank algorithm
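The abstract names the ingredients of the method (a PageRank-style ranking over texts, a time characteristic, a weight attribute, and MMR) but not its formulas. The following is only a minimal sketch of how such a pipeline could be assembled, not the authors' algorithm. It assumes that Lex-PageRank behaves like a LexRank-style PageRank on a cosine-similarity graph, that time and weight enter as a teleport prior, and that the prior has the hypothetical form `(1 + log(1 + interactions)) * exp(-age / tau)`. All function names, parameters, and default values are illustrative assumptions.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def lex_pagerank(sim, prior, damping=0.85, iters=100, tol=1e-9):
    """Power iteration on a row-normalised similarity graph.

    `prior` biases the teleport step (personalised PageRank); using it to
    carry time and weight information is an assumption of this sketch.
    """
    n = len(sim)
    total = sum(prior)
    p = [x / total for x in prior]
    rank = p[:]
    row_sums = [sum(row) for row in sim]
    for _ in range(iters):
        new = []
        for j in range(n):
            s = 0.0
            for i in range(n):
                if row_sums[i] > 0:
                    s += rank[i] * sim[i][j] / row_sums[i]
                else:
                    # Node with no similar neighbours: spread its mass by the prior.
                    s += rank[i] * p[j]
            new.append((1 - damping) * p[j] + damping * s)
        delta = sum(abs(x - y) for x, y in zip(new, rank))
        rank = new
        if delta < tol:
            break
    return rank


def mmr_select(rank, sim, k, lam=0.5):
    """Greedy MMR: trade off rank score against similarity to chosen texts."""
    selected = []
    candidates = set(range(len(rank)))
    top = max(rank)

    def score(i):
        redundancy = max((sim[i][j] for j in selected), default=0.0)
        return lam * rank[i] / top - (1 - lam) * redundancy

    while candidates and len(selected) < k:
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


def summarize(posts, k=2, tau=24.0, lam=0.5):
    """posts: list of (tokens, age_hours, interactions) for one cluster.

    Returns the indices of the k posts chosen for the summary.
    """
    bags = [Counter(tokens) for tokens, _, _ in posts]
    n = len(posts)
    sim = [[cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
           for i in range(n)]
    # Hypothetical prior: more interactions and more recent posts count more.
    prior = [(1.0 + math.log1p(inter)) * math.exp(-age / tau)
             for _, age, inter in posts]
    rank = lex_pagerank(sim, prior)
    return mmr_select(rank, sim, k, lam)
```

In this sketch the MMR step keeps a near-duplicate post out of the summary even when it ranks highly, since its similarity to an already selected post is subtracted from its score. The input is assumed to be pre-tokenized, so a Chinese word segmenter would be needed upstream for real Weibo text.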

