Computer Science, 2013, Vol. 40, Issue (Z11): 235-237.
胡长龙,唐晋韬,王挺
HU Chang-long, TANG Jin-tao and WANG Ting
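Abstract: A hashtag (microblog topic tag) is a topic label that an author attaches to a microblog post, and it helps users find hot topics efficiently in massive microblog data. Because hashtags are created by users, different hashtags may denote the same topic, so mining the topical relatedness between hashtags aids hot-topic discovery and aggregated display. This paper studies the problem of analyzing relatedness between hashtags. It extracts a set of features, namely hashtag text features, microblog content, the occurrence-count-over-time distribution of each hashtag, and hashtag co-occurrence, to analyze the topical relatedness between hashtags. Experiments on Sina Weibo data show that this combination of features supports hashtag relatedness analysis reasonably well.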
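As an illustration of the hashtag co-occurrence feature mentioned in the abstract, the following is a minimal sketch only. The abstract does not say how co-occurrence is measured; the Jaccard index over the sets of posts containing each hashtag is an assumption made here, and the function names, the regular expression, and the sample posts are all hypothetical.

```python
import re
from collections import defaultdict

# Assumption: Sina Weibo wraps a hashtag in a pair of '#' characters, e.g. "#topic#".
HASHTAG_PATTERN = re.compile(r"#([^#\s]+)#")


def index_posts_by_hashtag(posts):
    """Map each hashtag to the set of post indices in which it appears."""
    index = defaultdict(set)
    for post_id, text in enumerate(posts):
        for tag in HASHTAG_PATTERN.findall(text):
            index[tag].add(post_id)
    return index


def cooccurrence_jaccard(index, tag_a, tag_b):
    """Jaccard index of the post sets of two hashtags (0.0 if neither occurs)."""
    posts_a = index.get(tag_a, set())
    posts_b = index.get(tag_b, set())
    union = posts_a | posts_b
    if not union:
        return 0.0
    return len(posts_a & posts_b) / len(union)


# Hypothetical sample posts, not taken from the paper's data set.
posts = [
    "#worldcup# #football# what a match",
    "#worldcup# final tonight",
    "#football# transfer news",
    "#weather# rain again",
]
index = index_posts_by_hashtag(posts)
score = cooccurrence_jaccard(index, "worldcup", "football")
```

A score near 1 means the two hashtags almost always appear in the same posts, while a score of 0 means they never do. In practice such a score would be only one input among the other features the abstract lists.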