Computer Science ›› 2016, Vol. 43 ›› Issue (8): 223-228. doi: 10.11896/j.issn.1002-137X.2016.08.045

• Artificial Intelligence •

Group Partition in Topic-related Microblogging Spreading Based on Probability Generation Model

CHEN Jing, LIU Yan and WANG Xu-zhong

  1. State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450001
  • Published: 2018-12-01   Online: 2018-12-01
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61309007) and the National 863 Program of China (2012AA012902).

Abstract: Events spread rapidly through microblogs in the form of topics and can exert enormous influence. Analyzing the users who take part in topic propagation, and discovering groups with different topical interests and sentiment orientations, has therefore drawn wide attention from governments and enterprises. Most community detection algorithms currently applied to microblogs start from individual users and consider only their social ties, in isolation from the content they share, so the detected groups carry no semantic information. A few algorithms combine social ties with content, but they ignore the structural characteristics of microblogs themselves, including the behavior and sentiment information they contain, and are therefore poorly suited to mining groups in microblog topic discussions. Taking the microblog topic as the starting point, this paper proposes BP-STG, a group partition method based on a probabilistic generative model that jointly considers user interactions during topic propagation, microblog text content, sentiment polarity, and user behavior. Inference is performed with Gibbs sampling. The model mines not only groups with different topical interests but also each group's sentiment distribution and each user's activeness and behavior within a group. It can also be extended to other media with social-network properties, that is, texts associated with a group of people, such as e-mails and forum posts. Experiments on two topic datasets collected from Sina Weibo show that BP-STG partitions topic propagation groups effectively, identifies active users within each group together with their behavior patterns, and provides more meaningful semantic information than the state-of-the-art model.

Key words: Microblogging topic, Probability generation model, Group partition, Sentiment information, Behavior pattern
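
The abstract names Gibbs sampling as the inference method but does not give the BP-STG sampling equations. As a rough illustration of the general technique only, the sketch below runs collapsed Gibbs sampling on a plain LDA-style topic model; it is not the BP-STG model, and the function name `gibbs_lda`, its parameters, and the toy documents are all assumptions made for this example.

```python
import random


def gibbs_lda(docs, num_topics, vocab_size, iters=200,
              alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for plain LDA (illustrative, not BP-STG).

    docs is a list of documents, each a list of integer word ids.
    Returns (theta, doc_topic, topic_word).
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]          # tokens per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics                        # tokens per topic
    assign = []                                           # topic of every token

    # Random initial topic assignment for each token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current token from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Unnormalised conditional probability of each topic.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                # Resample the topic and restore the counts.
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Smoothed per-document topic proportions.
    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in doc_topic[d]]
        for d, doc in enumerate(docs)
    ]
    return theta, doc_topic, topic_word


# Toy corpus: word ids 0-1 and 2-3 tend to co-occur in separate documents.
toy_docs = [[0, 1, 0, 1, 0], [2, 3, 2, 3, 3], [0, 0, 1, 1], [3, 2, 2, 3]]
theta, doc_topic, topic_word = gibbs_lda(toy_docs, num_topics=2, vocab_size=4)
```

A model along the lines described in the abstract would add further latent variables (for example group, sentiment, and behavior) to the same remove, reweight, resample loop; how BP-STG does this is not stated here.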

