Computer Science, 2015, Vol. 42, Issue (10): 198-201.

• Software and Database Technology •

Provenance Based Information Management Method for Microblog Messages

HUANG Qing-yu and LU Luo-xian

  1. School of Information Engineering, Wuhan University of Technology, Wuhan 430070
  • Online: 2018-11-14 Published: 2018-11-14
  • Funding:
    This work was supported by the National Natural Science Foundation of China (61170306) and the Hubei Province Science and Technology Key Project Fund (2003AA101B05).

Abstract: On microblog platforms, users' messages arrive at the system as a stream in temporal order, and managing this stream efficiently allows user queries to be answered promptly. Drawing on the idea of data provenance from database research, a provenance-based information management method for microblog messages is proposed. First, the messages belonging to the same social event are defined as a provenance, according to the emergence, development, and change of that event. Second, the message stream is partitioned into different provenances, and the set of provenances is maintained dynamically as new messages arrive. Finally, the text messages in the provenances are used to answer users' queries. Experiments show that the proposed method uses little memory and runs efficiently, and that it can be used for real-time processing of microblog message streams and for responding to queries.

Key words: Provenance, Streaming data, Microblog, Information retrieval
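
The three steps in the abstract (group messages by event, maintain the groups as the stream arrives, answer queries from the groups) can be illustrated with a minimal sketch. The abstract does not specify how messages are matched to a provenance, so everything concrete here is an assumption made for illustration: the names Provenance and ProvenanceStore, whitespace tokenization, Jaccard similarity over term sets, and the 0.2 threshold are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Provenance:
    """Messages assumed to describe one social event (hypothetical structure)."""
    terms: set = field(default_factory=set)
    messages: list = field(default_factory=list)


def tokenize(text):
    # Assumption: lowercase whitespace tokenization stands in for real text processing.
    return set(text.lower().split())


def jaccard(a, b):
    # Jaccard similarity of two term sets; an assumed matching criterion.
    union = a | b
    return len(a & b) / len(union) if union else 0.0


class ProvenanceStore:
    def __init__(self, threshold=0.2):
        self.threshold = threshold  # assumed cutoff, not a value from the paper
        self.provenances = []

    def add(self, text):
        """Attach a new message to the best-matching provenance, or start a new one."""
        tokens = tokenize(text)
        best, best_score = None, 0.0
        for p in self.provenances:
            score = jaccard(tokens, p.terms)
            if score > best_score:
                best, best_score = p, score
        if best is None or best_score < self.threshold:
            best = Provenance()
            self.provenances.append(best)
        best.terms |= tokens
        best.messages.append(text)
        return best

    def query(self, text):
        """Return the messages of the provenance most similar to the query terms."""
        tokens = tokenize(text)
        scored = [(jaccard(tokens, p.terms), p) for p in self.provenances]
        scored = [(s, p) for s, p in scored if s > 0]
        if not scored:
            return []
        return max(scored, key=lambda sp: sp[0])[1].messages


# Example stream: two messages about one event, one about another.
store = ProvenanceStore()
for msg in [
    "earthquake hits city center",
    "city center earthquake damage reported",
    "new phone released today",
]:
    store.add(msg)

print(len(store.provenances))
print(store.query("earthquake damage"))
```

In this sketch each arriving message is compared only against per-provenance term sets rather than against every stored message, which is one plausible way such grouping could keep per-message work small; the paper's actual data structures and matching rules may differ.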

