Computer Science ›› 2018, Vol. 45 ›› Issue (10): 160-165. doi: 10.11896/j.issn.1002-137X.2018.10.030
王建, 张仰森, 陈若愚, 蒋玉茹, 尤建清
WANG Jian, ZHANG Yang-sen, CHEN Ruo-yu, JIANG Yu-ru, YOU Jian-qing
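Abstract: With the rapid development of Internet technology, malicious access behavior increasingly threatens network information security, so identifying the roles of visiting users and recognizing their malicious access behavior has both theoretical significance and practical value for network security. First, based on network log data, an auxiliary IP database is built and a daily role model is constructed for each IP user. On this basis, a sliding time window is introduced so that changes over time are dynamically incorporated into role identification, yielding a dynamic user role identification model based on a sliding time window. Then, after analyzing the traffic characteristics of malicious user access, the user access traffic feature and the user information entropy feature are weighted to build a multi-feature model for identifying malicious access behavior. This model can identify bursty and highly persistent malicious access, as well as malicious behavior that is small in volume but dispersed on a large scale. Finally, the model is implemented using big data storage and Spark in-memory computing. Experimental results show that, when network traffic becomes abnormal, the proposed model can find users with malicious access behavior and identify their roles accurately and efficiently, which verifies its effectiveness.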
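The abstract names two mechanisms without giving formulas: a sliding time window over daily roles, and a weighted combination of a traffic feature with an information entropy feature. The Python sketch below is only an illustration of those two ideas under stated assumptions and is not the authors' model. The function names, the role labels, the majority-vote rule, the `threshold` normalization, and the default weights are all hypothetical, and plain Python stands in for the Spark implementation the abstract mentions.

```python
import math
from collections import Counter


def shannon_entropy(items):
    """Shannon entropy (in bits) of the empirical distribution of `items`."""
    counts = Counter(items)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def normalized_entropy(items):
    """Entropy scaled to [0, 1] by the maximum possible for the distinct values seen."""
    distinct = len(set(items))
    if distinct < 2:
        return 0.0
    return shannon_entropy(items) / math.log2(distinct)


def malicious_score(request_count, threshold, urls, w_traffic=0.5, w_entropy=0.5):
    """Hypothetical weighted score combining a traffic feature and an entropy feature.

    Assumption: traffic is normalized against an illustrative `threshold`, and a
    higher entropy of requested URLs is taken as a sign of dispersed access.
    """
    traffic_feature = min(request_count / threshold, 1.0)
    entropy_feature = normalized_entropy(urls)
    return w_traffic * traffic_feature + w_entropy * entropy_feature


def window_role(daily_roles, window):
    """Assumed rule: the role in the window is the most frequent daily role
    among the last `window` days."""
    recent = daily_roles[-window:]
    return Counter(recent).most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical daily role labels for one IP, oldest first.
    roles = ["normal", "crawler", "crawler", "normal", "crawler"]
    print(window_role(roles, window=3))
    print(malicious_score(500, 1000, ["a", "a", "b", "b"]))
```

Because the window only looks at the most recent days, an IP whose behavior changes is relabeled once the new daily roles dominate the window, which is the sense in which the role identification is dynamic.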