Computer Science (计算机科学), 2019, Vol. 46, Issue (8): 42-49. doi: 10.11896/j.issn.1002-137X.2019.08.007
孙国道, 周志秀, 李思, 刘义鹏, 梁荣华
SUN Guo-dao, ZHOU Zhi-xiu, LI Si, LIU Yi-peng, LIANG Rong-hua
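Abstract: On social media, the tweets that users post record a wide range of information related to those users. The text covers the topics discussed in each tweet as well as temporal and spatial information, and analyzing the spatio-temporal evolution of topics from this information is of considerable research value. For tweet data, this paper designs a visual analytics pipeline for mining tweet information and presents the spatio-temporal evolution of tweet topics from multiple perspectives through user interaction. First, based on a portion of historical tweet data, the global geographic space is partitioned into regions using the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) clustering algorithm combined with Thiessen polygons. Then, for a topic of interest queried by the user, all related tweets are retrieved by index and their information is bound to the cluster centers. Finally, several visualization views that combine a temporal clustering algorithm with an adaptive algorithm are designed to show the spatio-temporal evolution of the topic. Experiments and analysis were carried out on tweet data crawled and stored through the API provided by the official Twitter website. The results show that the improved adaptive layout algorithm for the visualization views effectively resolves graphical occlusion and fully presents the spatio-temporal evolution patterns of tweets, and that the geographic region partitioning and the visualization components effectively help researchers analyze the spatio-temporal evolution of tweets and the distribution of hot topics of global concern.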
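The first two steps of the pipeline (DBSCAN clustering of historical tweet locations, Thiessen-polygon partitioning, and binding topic tweets to cluster centers) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes planar Euclidean distance on (longitude, latitude) pairs, uses the cluster centroid as the cluster center, and relies on the fact that a point lies in the Thiessen (Voronoi) cell of whichever center is nearest to it, so no polygon geometry is built. All function names, the `(text, location)` tweet format, the keyword match, and the sample data are hypothetical.

```python
from math import hypot


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    xi, yi = points[i]
    return [j for j, (x, y) in enumerate(points) if hypot(x - xi, y - yi) <= eps]


def dbscan(points, eps, min_pts):
    """Plain DBSCAN; returns one label per point, with -1 meaning noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            reachable = region_query(points, j, eps)
            if len(reachable) >= min_pts:  # j is a core point, keep expanding
                queue.extend(reachable)
    return labels


def cluster_centres(points, labels):
    """Centroid of every non-noise cluster (assumed here to be its center)."""
    sums = {}
    for (x, y), lab in zip(points, labels):
        if lab == -1:
            continue
        sx, sy, n = sums.get(lab, (0.0, 0.0, 0))
        sums[lab] = (sx + x, sy + y, n + 1)
    return {lab: (sx / n, sy / n) for lab, (sx, sy, n) in sums.items()}


def voronoi_region(centres, point):
    """Label of the Thiessen cell containing point, i.e. the nearest center."""
    px, py = point
    return min(centres, key=lambda lab: hypot(centres[lab][0] - px, centres[lab][1] - py))


def bind_topic(tweets, centres, keyword):
    """Count tweets mentioning keyword per region; tweets are (text, (lon, lat))."""
    counts = {lab: 0 for lab in centres}
    for text, location in tweets:
        if keyword.lower() in text.lower():
            counts[voronoi_region(centres, location)] += 1
    return counts


# Hypothetical historical tweet locations: two dense areas and one outlier.
history = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
           (10.0, 10.0), (10.1, 10.0), (10.0, 10.1),
           (50.0, 50.0)]
labels = dbscan(history, eps=0.5, min_pts=3)
centres = cluster_centres(history, labels)

# Hypothetical query: bind tweets about one topic to the partitioned regions.
tweets = [("I love #WorldCup", (0.5, 0.5)),
          ("worldcup final tonight", (11.0, 9.0)),
          ("lunch", (0.0, 0.0))]
print(bind_topic(tweets, centres, "worldcup"))
```

For real geographic data, a great-circle distance and a spatial index would be more appropriate than the planar distance and linear scans used here.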