Computer Science ›› 2018, Vol. 45 ›› Issue (3): 124-130. doi: 10.11896/j.issn.1002-137X.2018.03.020
汪鑫,武杨,卢志刚
WANG Xin, WU Yang and LU Zhi-gang
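Abstract: Internet applications now reach into every aspect of daily life, and malicious URLs are hard to guard against, posing a serious threat to people's property and privacy. Current mainstream defenses rely mainly on blacklists, which struggle to detect URLs that are not on the blacklist. Introducing machine learning to improve malicious URL detection is therefore a major research direction, but it is constrained by the short-text nature of URLs: the extracted features are limited in variety, which leads to poor detection performance. To address these challenges, a malicious URL detection system based on a threat intelligence platform is designed. From each URL string the system extracts three types of features (structural features, intelligence features, and sensitive-word features) to train classifiers, then uses a multi-classifier voting mechanism to decide the class, and updates the threat intelligence automatically. Experimental results show that the method detects malicious URLs with an accuracy above 96%.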
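The pipeline outlined in the abstract (three feature groups feeding several classifiers whose votes are combined) can be illustrated with a minimal sketch. This is not the authors' implementation: the sensitive-word list, the `KNOWN_BAD_HOSTS` set standing in for a threat intelligence lookup, the thresholds, and the three rule-based "classifiers" are all hypothetical placeholders chosen only to show the voting structure; the paper trains real classifiers on labeled data.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical sensitive-word list; the paper's actual list is not given here.
SENSITIVE_WORDS = ("login", "verify", "account", "update", "bank")

# Hypothetical stand-in for a threat intelligence platform lookup.
KNOWN_BAD_HOSTS = {"malicious.example"}


def extract_features(url):
    """Return structural, intelligence, and sensitive-word features for a URL."""
    host = urlparse(url).hostname or ""
    lowered = url.lower()
    return {
        # Structural features
        "length": len(url),
        "num_dots": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        # Intelligence feature (assumed: host appears in an intelligence feed)
        "intel_hit": host in KNOWN_BAD_HOSTS,
        # Sensitive-word feature: how many listed words occur in the URL
        "sensitive_hits": sum(word in lowered for word in SENSITIVE_WORDS),
    }


# Three toy rule-based classifiers with illustrative thresholds.
def structural_clf(f):
    return "malicious" if f["length"] > 60 or f["num_dots"] > 3 else "benign"


def intel_clf(f):
    return "malicious" if f["intel_hit"] else "benign"


def keyword_clf(f):
    return "malicious" if f["sensitive_hits"] >= 2 else "benign"


CLASSIFIERS = (structural_clf, intel_clf, keyword_clf)


def majority_vote(votes):
    """Return the label that received the most votes."""
    return Counter(votes).most_common(1)[0][0]


def classify(url):
    """Classify a URL by majority vote over all classifiers."""
    features = extract_features(url)
    return majority_vote([clf(features) for clf in CLASSIFIERS])
```

Using an odd number of binary classifiers avoids tied votes. In a trained system each rule above would be replaced by a fitted model, and the intelligence lookup would query a feed that is refreshed automatically, as the abstract describes.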