Computer Science, 2018, Vol. 45, Issue (3): 124-130. doi: 10.11896/j.issn.1002-137X.2018.03.020


Study on Malicious URL Detection Based on Threat Intelligence Platform

WANG Xin, WU Yang and LU Zhi-gang   

  • Online: 2018-03-15 Published: 2018-11-13

Abstract: As the Internet penetrates daily life, ubiquitous malicious URLs are hard to guard against and seriously threaten people's property and privacy. Traditional malicious URL detection relies on a blacklist mechanism, which can do nothing about malicious URLs that are not on the list. Bringing in machine learning to improve malicious URL detection is therefore one of the fundamental research directions. However, the results of most existing solutions are unsatisfactory, because URLs are short texts and these solutions tend to extract only a single type of feature. To address these problems, this paper designs a novel system for detecting malicious URLs based on a threat intelligence platform. The system extracts structural features, intelligence features, and sensitive lexical features to train classifiers. A voting mechanism over the results of multiple classifiers then determines the type of each URL. Finally, the threat intelligence can be updated automatically. Experimental results show that the method detects malicious URLs effectively, achieving a classification accuracy of up to 96%.

Key words: Malicious URL, Threat intelligence, Classifier, Voting mechanism
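
The pipeline described in the abstract (three feature groups, several classifiers, a majority vote) can be illustrated with a minimal sketch. This is not the authors' implementation: the word list, the intelligence host set, the thresholds, and the hand-written rules standing in for trained classifiers are all hypothetical, chosen only to show how per-classifier votes could be combined. The automatic threat intelligence update is not modeled.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Hypothetical stand-ins: the abstract does not give the actual word list
# or the contents of the threat intelligence platform.
SENSITIVE_WORDS = ("login", "verify", "account", "secure", "update")
THREAT_INTEL_HOSTS = {"bad.example.net"}


def extract_features(url):
    """Build one feature dict covering the three feature groups."""
    host = urlparse(url).hostname or ""
    return {
        # Structural features
        "length": len(url),
        "dots": host.count("."),
        "ip_host": bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host)),
        # Intelligence feature: is the host known to the (assumed) feed?
        "in_intel": host in THREAT_INTEL_HOSTS,
        # Sensitive lexical feature: how many sensitive words appear
        "sensitive": sum(word in url.lower() for word in SENSITIVE_WORDS),
    }


# Each function below is a rule-based placeholder for a trained classifier.
def structural_vote(f):
    suspicious = f["ip_host"] or f["dots"] > 3 or f["length"] > 75
    return "malicious" if suspicious else "benign"


def intelligence_vote(f):
    return "malicious" if f["in_intel"] else "benign"


def lexical_vote(f):
    return "malicious" if f["sensitive"] >= 2 else "benign"


VOTERS = (structural_vote, intelligence_vote, lexical_vote)


def classify(url):
    """Return the label chosen by the majority of the three voters."""
    features = extract_features(url)
    votes = Counter(voter(features) for voter in VOTERS)
    # Three voters and two labels, so a tie cannot occur.
    return votes.most_common(1)[0][0]
```

With an odd number of voters and two labels, the vote always has a clear winner; a system with more labels or an even number of classifiers would need an explicit tie-breaking rule.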

