Computer Science ›› 2023, Vol. 50 ›› Issue (6A): 220600172-6. doi: 10.11896/jsjkx.220600172
王清宇, 王海瑞, 朱贵富, 孟顺建
WANG Qingyu, WANG Hairui, ZHU Guifu, MENG Shunjian
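Abstract: Deep learning methods for SQL injection detection tend to overfit when labeled data is scarce. To address this problem, a semi-supervised model named FlexUDA is proposed. The collected data is first preprocessed by decoding, generalization, and tokenization. The unlabeled data is then augmented according to computed TF-IDF values, and both the original and the augmented data are vectorized with an algorithm that fuses TF-IDF and Word2Vec. Finally, the FlexUDA model is trained and compared against other models. Experimental results show that FlexUDA, trained on only 1,000 labeled samples and 100,000 unlabeled samples, achieves 99.42% accuracy and 99.23% recall. Compared with supervised models, it generalizes better and effectively mitigates the overfitting caused by insufficient labeled data in SQL injection detection.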
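The preprocessing and augmentation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `preprocess` and `tfidf_augment`, the placeholder tokens `NUM` and `HEX`, the replacement-probability formula, and the parameter `p` are all assumptions made for illustration. The augmentation is only loosely modeled on TF-IDF word replacement, in which tokens with low TF-IDF scores are more likely to be swapped for a random vocabulary token.

```python
import math
import random
import re
from collections import Counter
from urllib.parse import unquote


def preprocess(payload: str) -> list[str]:
    """Decode, generalize, and tokenize one request payload (illustrative)."""
    # Decode repeatedly until stable, since payloads may be encoded more than once.
    prev, text = None, payload
    while prev != text:
        prev, text = text, unquote(text)
    text = text.lower()
    # Generalize literals to placeholder tokens (HEX and NUM are assumed names).
    text = re.sub(r"0x[0-9a-f]+", " HEX ", text)
    text = re.sub(r"\b\d+(\.\d+)?\b", " NUM ", text)
    # Tokenize into alphabetic words and single non-space symbols.
    return re.findall(r"[A-Za-z_]+|[^\sA-Za-z_]", text)


def tfidf_augment(docs: list[list[str]], p: float = 0.3, seed: int = 0) -> list[list[str]]:
    """Replace low-TF-IDF tokens with random vocabulary tokens (illustrative)."""
    rng = random.Random(seed)
    n_docs = len(docs)
    # Document frequency: number of documents containing each token.
    df = Counter(tok for doc in docs for tok in set(doc))
    idf = {tok: math.log(n_docs / df[tok]) for tok in df}
    vocab = sorted(idf)
    augmented = []
    for doc in docs:
        if not doc:
            augmented.append([])
            continue
        tf = Counter(doc)
        scores = [tf[tok] / len(doc) * idf[tok] for tok in doc]
        top = max(scores)
        new_doc = []
        for tok, score in zip(doc, scores):
            # Assumed rule: the further a token's score falls below the
            # document maximum, the more likely it is to be replaced.
            prob = p * (top - score) / top if top > 0 else 0.0
            new_doc.append(rng.choice(vocab) if rng.random() < prob else tok)
        augmented.append(new_doc)
    return augmented


if __name__ == "__main__":
    raw = [
        "id=1%27%20OR%20%271%27%3D%271",
        "name=alice&page=2",
        "id=5%20UNION%20SELECT%20password%20FROM%20users",
    ]
    docs = [preprocess(r) for r in raw]
    for original, noisy in zip(docs, tfidf_augment(docs, p=0.5, seed=1)):
        print(original, "->", noisy)
```

In a consistency-training setup, each unlabeled sample and its augmented copy would then be vectorized and passed to the model, which is penalized when its predictions for the pair disagree. The sketch stops before that step, and it does not cover the TF-IDF and Word2Vec fusion or the FlexUDA training itself.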