Study on Detection of Weibo Spammers Based on Danger Theory in Artificial Immunity System

doi:10.11896/j.issn.1002-137X.2018.11.020

Abstract

Abstract: This paper introduces the danger theory of artificial immune systems into the analysis of user behavior characteristics in order to identify Weibo spammers effectively. Taking Sina Weibo as an example, it analyzes the behavior characteristics of Sina Weibo spammers and selects the total number of posts, Weibo level, verification status, Sunshine Credit, and number of followers as feature attributes. The results of analyzing these attributes serve as the feature signals that distinguish spammers from normal users, and spammers are then identified with the Dendritic Cell Algorithm (DCA). Real data from Sina Weibo users were used to verify the effectiveness of the algorithm and to run comparison experiments. The results indicate that the method can effectively detect spammers on Sina Weibo with high detection accuracy.

Key words: Artificial immunity, Behavioral characteristics, Danger theory, Dendritic cell algorithm, Weibo spammers
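
To illustrate how a dendritic-cell-style classifier can turn danger and safe signals into a spammer verdict, here is a minimal Python sketch. It is not the authors' implementation. The abstract does not give the mapping from Weibo attributes to signals, nor any weights or thresholds, so the function names (`mcav`, `is_spammer`), the context weighting `danger - 2 * safe` (a form used in simplified deterministic DCA variants), the migration thresholds, the 0.5 cutoff, and the sample signal values are all assumptions made for illustration.

```python
def mcav(observations, thresholds=(1.0, 2.0, 3.0)):
    """Return the fraction of simulated dendritic cells that end up mature.

    observations: list of (danger, safe) signal pairs for one user, each
        value in [0, 1]. How Weibo attributes map to these signals is an
        assumption; the abstract does not specify it.
    thresholds: one migration threshold per simulated cell (hypothetical).
    """
    mature_cells = 0
    for limit in thresholds:
        csm = 0.0  # costimulation: total signal this cell has sampled
        k = 0.0    # context: positive when danger outweighs safe
        for danger, safe in observations:
            csm += danger + safe
            k += danger - 2.0 * safe
            if csm >= limit:  # the cell migrates and stops sampling
                break
        if k > 0:
            mature_cells += 1
    return mature_cells / len(thresholds)


def is_spammer(observations, cutoff=0.5):
    """Flag a user when most cells presented it in a mature context."""
    return mcav(observations) > cutoff


# Made-up signal values, for illustration only.
suspicious = [(0.9, 0.1), (0.8, 0.0), (0.7, 0.2)]
ordinary = [(0.1, 0.9), (0.0, 0.8), (0.2, 0.7)]
print(mcav(suspicious), is_spammer(suspicious))  # 1.0 True
print(mcav(ordinary), is_spammer(ordinary))      # 0.0 False
```

Using several cells with different migration thresholds means each cell judges the user over a different number of observations, and the final score is the share of cells that saw more danger than safety.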

CLC number: TP393

YANG Chao, QIN Ting-dong, FAN Bo, LI Tao. Study on Detection of Weibo Spammers Based on Danger Theory in Artificial Immunity System[J]. Computer Science, 2018, 45(11): 138-142. https://doi.org/10.11896/j.issn.1002-137X.2018.11.020
