Computer Science ›› 2017, Vol. 44 ›› Issue (Z11): 385-390. doi: 10.11896/j.issn.1002-137X.2017.11A.081

• Big Data and Data Mining •

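Interest-Based User Clustering and Visualization for Social Networks

TANG Ying, ZHONG Nan-jiang, SUN Kang-gao, QIN Da-kang, ZHOU Wei-hua

  1. College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023; College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023; College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023; School of Science, Nantong University, Nantong 226019; School of Management, Zhejiang University, Hangzhou 310027
  • Publication date: 2018-12-01 Release date: 2018-12-01
  • Funding:
    This work was supported by the Program for New Century Excellent Talents in University of the Ministry of Education of China (NCET-13-0526), the National Natural Science Foundation of China (71571160), and the Zhejiang Provincial Natural Science Foundation (LY14F020021).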

Clustering and Visualization of Social Network Based on User Interests

TANG Ying, ZHONG Nan-jiang, SUN Kang-gao, QIN Da-kang and ZHOU Wei-hua   

  • Online:2018-12-01 Published:2018-12-01

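Abstract: As social networks grow in popularity, it has become especially important to extract useful information from the wide variety of social network data and to analyze it through clear, intuitive visualization, so that users gain valuable latent knowledge. Cluster analysis is an important tool in data mining, but traditional user clustering for social network data mostly considers only the topological link structure of the network and ignores the similarity of user interests. This paper computes user-interest similarity with a Bayesian probabilistic model, clusters users accordingly, and designs an interactive visualization to present the clustering results. Specifically, a latent semantic model is built on user rating data from the social network to extract a feature vector representing each user's interests. Users are then clustered by these feature vectors into groups with different characteristics, and a suitable number of clusters is chosen through experiments and heat maps. Finally, a visualization and analysis scheme based on hierarchical bubble charts is proposed. It displays multidimensional information such as users, movie genres, and movies interactively in one graphic, supports progressive exploration from a global overview down to local detail, and visualizes group characteristics from multiple angles. Experiments and analysis on user and movie rating data from Douban verify the effectiveness of the proposed method.

Keywords: social network, clustering, data visualization, latent semantic model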
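
The feature-extraction step described in the abstract can be illustrated with a small sketch. It assumes a simplified aspect model in the style of probabilistic latent semantic analysis, fitted by EM, with ratings treated as co-occurrence weights; this is a stand-in and not necessarily the model the authors used. The function name `plsa_user_features` and the toy rating matrix are hypothetical.

```python
import random


def plsa_user_features(counts, n_topics, n_iter=50, seed=0):
    """Fit a small aspect model by EM and return P(z|u) for every user.

    counts[u][i] is a non-negative weight (here: a rating) of user u for item i.
    Treating ratings as weights is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    n_users, n_items = len(counts), len(counts[0])
    # P(z|u) starts uniform; P(i|z) starts random to break the symmetry.
    p_z_u = [[1.0 / n_topics] * n_topics for _ in range(n_users)]
    p_i_z = []
    for _ in range(n_topics):
        row = [rng.random() + 0.1 for _ in range(n_items)]
        total = sum(row)
        p_i_z.append([v / total for v in row])

    for _ in range(n_iter):
        new_z_u = [[0.0] * n_topics for _ in range(n_users)]
        new_i_z = [[0.0] * n_items for _ in range(n_topics)]
        for u in range(n_users):
            for i in range(n_items):
                n = counts[u][i]
                if n == 0:
                    continue
                # E-step: posterior over latent topics for this (user, item) pair.
                post = [p_z_u[u][z] * p_i_z[z][i] for z in range(n_topics)]
                norm = sum(post) or 1.0
                for z in range(n_topics):
                    w = n * post[z] / norm
                    new_z_u[u][z] += w
                    new_i_z[z][i] += w
        # M-step: renormalise the accumulated expected weights.
        for u in range(n_users):
            total = sum(new_z_u[u]) or 1.0
            p_z_u[u] = [v / total for v in new_z_u[u]]
        for z in range(n_topics):
            total = sum(new_i_z[z]) or 1.0
            p_i_z[z] = [v / total for v in new_i_z[z]]
    return p_z_u


# Hypothetical toy data: 4 users x 4 movies, 0 means "not rated".
ratings = [
    [5, 4, 0, 0],
    [5, 4, 0, 0],
    [0, 0, 4, 5],
    [0, 1, 5, 4],
]
features = plsa_user_features(ratings, n_topics=2)
for user_id, vec in enumerate(features):
    print(user_id, [round(v, 3) for v in vec])
```

Each returned row is a probability distribution over latent topics and can serve as the per-user interest vector that a later clustering step consumes. Users with identical rating rows receive identical vectors because every user starts from the same uniform distribution.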

Abstract: With the development of social networks, it is becoming increasingly important to extract useful information from social network data and to present valuable knowledge to users through an intuitive, interactive visual interface. Clustering, a crucial method in data mining, provides a global view of the data. Traditional clustering methods for social network data mainly consider the topological structure of the network and do not take user interests into account. In this paper, users are clustered by computing user-interest similarity with a Bayesian probabilistic model, and an interactive visualization method is designed to present the clustering results. Specifically, we compute feature vectors representing users' interests with a latent semantic model, and then build clusters with different interest characteristics from these feature vectors. A suitable number of clusters is determined from heat map visualizations. Finally, we present an interactive visualization method based on hierarchical bubble charts that lets users explore the clustering results from a global overview down to local details. We performed experiments and analysis on data crawled from the Douban website. The results validate the effectiveness of our method.

Key words: Social network, Clustering, Data visualization, Latent semantic model
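
The clustering step and the choice of cluster count can be sketched as follows. The abstract does not name the clustering algorithm, so plain k-means is used here purely as an assumption, and the within-cluster sum of squares stands in as a simple numeric check next to the heat-map inspection the paper describes. The vectors in `user_vectors` and all function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=100):
    """Plain k-means with farthest-point initialisation (deterministic)."""
    centers = [list(points[0])]
    while len(centers) < k:
        # Pick the point that is farthest from all centers chosen so far.
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    labels = [None] * len(points)
    for _ in range(n_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


def within_cluster_ss(points, labels, centers):
    """Total squared distance of every point to its assigned center."""
    return sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))


# Hypothetical interest vectors for four users (e.g. topic proportions).
user_vectors = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]

labels, centers = kmeans(user_vectors, 2)
print(labels)

# Compare candidate cluster counts by their within-cluster sum of squares.
for k in (1, 2, 3):
    lab_k, cen_k = kmeans(user_vectors, k)
    print(k, round(within_cluster_ss(user_vectors, lab_k, cen_k), 4))
```

A sharp drop in the within-cluster sum of squares followed by a plateau suggests a reasonable cluster count; in practice this would be read alongside a visual check such as the heat maps mentioned in the abstract.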

