计算机科学 ›› 2019, Vol. 46 ›› Issue (5): 129-134.doi: 10.11896/j.issn.1002-137X.2019.05.020

• 信息安全 • 上一篇    下一篇

基于拓扑扩展的在线社交网络恶意信息源定位算法

袁得嵛1,2, 高见1,2, 叶萌熙1, 王小娟3   

  1. (中国人民公安大学信息技术与网络安全学院 北京102623)1
    (安全防范与风险评估公安部重点实验室 北京102623)2
    (北京邮电大学电子工程学院 北京100876)3
  • 收稿日期:2018-04-12 修回日期:2018-07-27 发布日期:2019-05-15
  • 作者简介:袁得嵛(1986-),男,博士,讲师,主要研究方向为网络安全、复杂网络,E-mail:yuandeyu@gmail.com(通信作者);高 见(1982-),男,博士,讲师,主要研究方向为网络攻防、僵尸网络;叶萌熙(1997-),男,主要研究方向为网络安全;王小娟(1985-),女,博士,副教授,主要研究方向为复杂网络、智能算法。
  • 基金资助:
    国家自然科学基金资助项目(61771072),北京市自然科学基金资助项目(4184099),公安部科技强警基础工作专项(2017GABJC38),中国人民公安大学基本科研业务费项目(2016JKF01317)资助。

Malicious Information Source Locating Algorithm Based on Topological Extension in Online Social Network

YUAN De-yu1,2, GAO Jian1,2, YE Meng-xi1, WANG Xiao-juan3,   

  1. (Institute of Information Technology and Cyber Security,People’s Public Security University of China,Beijing 102623,China)1
    (Key Laboratory of Safety Precautions and Risk Assessment,Ministry of Public Security,Beijing 102623,China)2
    (School of Electronic Engineering,Beijing University of Posts and Telecommunications,Beijing 100876,China)3
  • Received:2018-04-12 Revised:2018-07-27 Published:2019-05-15

摘要: 随着在线社交网络的飞速发展,社交媒体成为网络用户参与的主要平台。恶意信息常常隐藏于在线社交网络的海量数据中,加之拓扑结构的局部性、恶意信息的伪装性,定位和溯源恶意信息面临着很大困难。一方面,仅仅通过人工标注的方式难以实现全局监控,即使借助语义分析、信息搜索等方式也只能在识别热点后获取当前网络中的信息“碎片”;另外,信息在演变过程中的变异,使得一条传播链条会中断分裂为多条,不加以识别和区分会增加信息源头数目,大大增加溯源定位的算法复杂度。另一方面,恶意信息常采用伪装手法,如提供虚假的要素、制造热点吸引用户、水军炒作干扰视线等,使得信息拓扑和网络关系拓扑并不一致。原有的定位算法依赖于当前感染节点的分布和当前拓扑,感染状态从单一化向随机化的变化,使得统计推断框架更复杂,需要改进非观测节点的状态推断方法。在线社交网络的信息传播过程中,信息的传播关系常常附加在信息本身中,可以根据当前网络节点的状态挖掘隐藏信息。结合当前网络节点的状态,提出关系拓扑和信息拓扑的概念,并设计基于信息拓扑的候选源点扩展算法。在此基础上,文中提出基于Jordan中心的恶意信息溯源算法。在模型网络和实际网络上的实验表明,相对于对比算法,所提算法能够有效识别出恶意信息源。

关键词: 恶意信息源定位, 拓扑扩展, 在线社交网络

Abstract: Social media on the Internet has become the main interactive platform for online users with the rapid development of online social network.Malicious information often hides in the massive data of online social networks.In addition,the limitation of topologies and the disguise of malicious information bring a lot of difficulties for locating and tra-cing malicious information.On the one hand,it is difficult to achieve global monitoring by means of manual labeling.Evenby means of semantic analysis and information search,only the information “fragments” in the current network can be obtained after the hotspots are recognized.Coupled with the variation of information in the evolution process,a chain of propagation will be interrupted and split into multiple pieces.Without identifying and distinguishing,the number of information sources will be increased,and the algorithm complexity will be greatly increased.On the other hand,malicious information often uses camouflage techniques,such as providing false elements,manufacturing hotspots to attract users,and manipulating online “water army” to interfere with the public opinions,which makes the information topology and relation topology inconsistent.The original locating algorithm relies on the distribution of the current infected nodes and the current topology.The singularization of the infection state changes to randomization,making the statistical inference framework more complex.It is necessary to improve the state inference method of non-observed nodes.In the process of information dissemination of online social networks,the information propagation relationship is often attached to the information itself.Therefore,hidden information can be mined according to the state of the current network node.Based on the current state of network nodes,this paper proposed the concepts of relation topology and information topology and designed a candidate source expansion algorithm based on information topology.Based on this,this paper also pre-sented a malicious information locating algorithm based on Jordan center.Experiments on the generated network and real network show that the algorithm can effectively identify malicious information sources compared with other algorithms.

Key words: Malicious information source locating, Online social networks, Topological expansion

中图分类号: 

  • TP309
[1]THOMSON R,ITO N,SUDA H,et al.Trusting tweets:The Fukushima disaster and information source credibility on Twitter[C]∥Proceedings of the 9th International ISCRAM Confe-rence.2012:1-10.
[2]GHOSH S,SHARMA N,BENEVENUTO F,et al.Cognos:crowdsourcing search for topic experts in microblogs[C]∥Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval.ACM,2012:575-590.
[3]YANG F,LIU Y,YU X,et al.Automatic detection of rumor on sinaweibo[C]∥Proceedings of the ACM SIGKDD Workshop on Mining Data Semantics.ACM,2012.
[4]DIAKOPOULOS N,DE CHOUDHURY M,NAAMAN M.
Finding and assessing social media information sources in the context of journalism[C]∥Proceedings of the SIGCHIConfe-rence on Human Factors in Computing Systems.ACM,2012:2451-2460.
[5]SUN S,LIU H,HE J,et al.Detecting Event Rumors on Sina Weibo Automatically [M]∥Web Technologies and Applications.Berlin:Springer,2013:120-131.
[6]CANINI K R,SUH B,PIROLLI P L.Finding Credible Information Sources In Social Networks Based On Content And Social Structure[C]∥2011 IEEE Third International Conference on and 2011 Privacy,Security,Risk and Trust (PASSAT).2011:1-8.
[7]GUAN W,GAO H,YANG M,et al.Analyzing user behavior of the micro-blogging website Sina Weibo during hot social events[J].Physica A Statistical Mechanics & Its Applications,2014,395(4):340-351.
[8]LIU Y,XU S,TOURASSI G.Detecting Rumors Through Mo-deling Information Propagation Networks in a Social Media Environment[M]∥Social Computing,Behavioral-Cultural Mode-ling,and Prediction.Springer International Publishing,2015:121-130.
[9]GONZALEZ-BAILON S,BORGE-HOLTHOEFER J,MORENOY.Broadcasters and Hidden Influentials in Online Protest Diffusion[J].American Behavioral Scientist,2012,57(7):943-965.
[10]CHEN W,LU W,ZHANG N.Time-Critical Influence Maximization in Social Networks with Time-Delayed Diffusion Process[C]∥Twenty-Sixth AAAI Conference on Artificial Intelligence.2012.
[11]ZHOU T,LIU J G,BAI W J,et al.Behaviors of susceptible-infected epidemics on scale-free networks with identical infectivity[J].Physical Review E,2006,74(5):056109.
[12]LEE H K,SHIM P S,NOH J D.Epidemic threshold of the susceptible-infected-susceptible model on complex networks.[J].Physical Review E,2013,87(6):88-90.
[13]YUAN-YUAN M A,ZHUANG X T.Susceptible-infected-re-moved (SIR) model of crisis spreading in the correlated network of listed companies and their main stock-holders[J].Journal of Management Sciences in China,2013,7(16):80-94.
[14]BIANCHI P,DEBBAH M,MAIDA M,et al.1Performance of Statistical Tests for Single Source Detection using Random Matrix Theory[J].IEEE Transactions on Information Theory,2010,57(4):2400-2419.
[15]WANG Z,DONG W,ZHANG W,et al.Rooting out RumorSources in Online Social Networks:The Value of Diversity from Multiple Observations[J].IEEE Journal of Selected Topics in Signal Processing,2015,9(4):663-677.
[16]ZANG W,ZHANG P,ZHOU C,et al.Discovering Multiple Diffusion Source Nodes in Social Networks[J].Procedia Computer Science,2014,29:443-452.
[17]ANTULOV-FANTULIN N,LANCIC A,STEFANCIC H,et al.Statistical inference framework for source detection of contagion processes on arbitrary network structures[C]∥Procee-dings of 2014 IEEE Eighth International Conference on Self-Adaptive and Self-Organizing Systems Workshops.2013:78-83.
[18]DONG W X,ZHANG W,TAN C W.Rooting out the Rumor Culprit from Suspects[C]∥2013 IEEE International Sympo-sium on Information Theory Proceedings (ISIT).IEEE,2015,18(3):731-747.
[19]ZHANG E,WANG G,GAO K,et al.Finding critical blocks of information diffusion in social networks[J].World Wide Web-internet & Web Information Systems,2013:521-532.
[20]ANTULOV-FANTULIN N,LANCIC A,SMUC T,et al.De-tectability limits of epidemic sources in networks[J].Physical Review Letter,2014,114(10):589-596.
[21]LUO W,TAY W P,LENG M.Identifying Infection Sources and Regions in Large Networks[J].IEEE Transactions on Signal Processing ,2013,61(11):2850-2865.
[22]RICHARDSON M,DOMINGOS P.Mining Knowledge-Sharing Sites for Viral Marketing[C]∥Eighth Intl.Conf.on Knowledge Discovery and Data Mining.2002:61-70.
[23]KEMPE D.Maximizing the Spread of Influence Through a Social Network[J].Kdd Proceedings of the Ninth AcmSigkdd International Conference on Knowledge Discovery & Data,2003:137-146.
[24]CHEN W,WANG Y,YANG S.Efficient influence maximization in social networks[C]∥Proc. of Acm Kdd.2009:199-208.
[25]GOYAL A,BONCHI F,LAKSHMANAN L V S.A Data-Based Approach to Social Influence Maximization[J].Proceedings of the VLDB Endowment,2011,1(5):73-84.
[26]ZHUANG H,SUN Y,TANG J,et al.Influence Maximization in Dynamic Social Networks[C]∥2013 IEEE 13th International Conference on Data Mining (ICDM).IEEE,2013:1313-1318.
[27]PINTO P C,THIRAN P,VETTERLI M.Locating the Source of Diffusion in Large-Scale Networks[J].Physical Review Letters,2012,109(6):1-5.
[28]BORGE-HOLTHOEFER J,RIVERO A,MORENO Y.Locating privileged information spreaders during political protests on an Online Social Network[J].Holthoefer,2011,85(6):4454-4463.
[29]LUO W,MEMBER S,TAY W P,et al.How to Identify an Infection Source with Limited Observations[J].IEEE Journal of Selected Topics in Signal Processing,2013,8(4):586-597.
[30]SHAH D,ZAMAN T.Rumor centrality:a universal source detector[C]∥ACM SIGMETRICS Performance Evaluation Review.ACM,2012,40(1):199-210.
[31]ZHU K,YING L.Information source detection in the SIR mo-del:A sample path based approach[C]∥Information Theory and Applications Workshop (ITA).2013:1-9.
[32]SHAH D,ZAMAN T.Detecting sources of computer viruses in networks:theory and experiment[J].Acm Sigmetrics perfor-mance Evaluation Review,2010,38(1):203-214.
[33]ZHU K,YING L.A robust information source estimator with sparse observations[C]∥INFOCOM.IEEE,2014:2211-2219.
[34]LUO W,TAY W P,LENG M.On the Universality of Jordan Centers for Estimating Infection Sources in Tree Networks[J].IEEE Transactions on Information Theory,2014,PP(99):1-1.
[35]ADAMIC L,GLANCE N.The political blogosphere and the2004 US election[C]∥International Workshop on Link Disco-very.ACM Press,2005:36-43.
[1] 袁得嵛, 陈世聪, 高见, 王小娟.
基于斯塔克尔伯格博弈的在线社交网络扭曲信息干预算法
Intervention Algorithm for Distorted Information in Online Social Networks Based on Stackelberg Game
计算机科学, 2021, 48(3): 313-319. https://doi.org/10.11896/jsjkx.200400079
[2] 孙永樾, 李红燕, 张金波.
RAISE:一种高效的社交网络影响成本最小化算法
RAISE:Efficient Influence Cost Minimizing Algorithm in Social Network
计算机科学, 2019, 46(9): 59-65. https://doi.org/10.11896/j.issn.1002-137X.2019.09.007
[3] 黄建一, 李建江, 王铮, 方明哲.
基于上下文相似度矩阵的Single-Pass短文本聚类
Single-Pass Short Text Clustering Based on Context Similarity Matrix
计算机科学, 2019, 46(4): 50-56. https://doi.org/10.11896/j.issn.1002-137X.2019.04.008
[4] 贺超波,杨镇雄,洪少文,汤庸,陈国华,郑凯.
应用随机游走的社交网络用户分类方法
User Classification Method in Online Social Network Using Random Walks
计算机科学, 2015, 42(2): 198-203. https://doi.org/10.11896/j.issn.1002-137X.2015.02.042
[5] 张星,於志文,梁韵基,郭斌.
基于交互相似度的细粒度社群发掘方法
Community Development Method Based on Interactive Similarity
计算机科学, 2014, 41(4): 215-218.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!