Computer Science ›› 2013, Vol. 40 ›› Issue (4): 127-130.

• Information Security •

Analysis of Topic Models on Modeling MicroBlog User Interestingness

CHEN Wen-tao, ZHANG Xiao-ming and LI Zhou-jun

  1. School of Computer Science, Beihang University, Beijing 100191
  • Online: 2018-11-16 Published: 2018-11-16

Abstract: This paper analyses different topic models and, through three experiments, compares the performance of three extended topic models in modeling microblog user interest. Experimental results show that TwitterLDA is suited to predicting words in new, unseen documents and for new users, that the topics generated by AuthorLDA are more clearly differentiated, and that UserLDA and AuthorLDA better reflect users' relationships in the real social network. This work lays a foundation for further study of how topic models can be applied to text mining applications on microblogs, such as personalized recommendation, sentiment analysis, and topic detection and tracking.

Key words: Topic model, User interest, Personalized service

