Computer Science, 2023, Vol. 50, Issue (6A): 220600172-6. doi: 10.11896/jsjkx.220600172

• Information Security •

Study on SQL Injection Detection Based on FlexUDA Model

WANG Qingyu, WANG Hairui, ZHU Guifu, MENG Shunjian   

  1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
  • Online: 2023-06-10 Published: 2023-06-12
  • About author: WANG Qingyu, born in 1995, postgraduate. His main research interests include cyber security and machine learning. WANG Hairui, born in 1969, Ph.D, professor, is a member of China Computer Federation. His main research interests include multimedia intelligence technology, network control technology, and embedded application technology.
  • Supported by:
    National Natural Science Foundation of China (61863016, 61263023).

Abstract: When deep learning methods are used to detect SQL injection, a shortage of labeled data easily leads to model overfitting. To address this problem, a FlexUDA model based on semi-supervised learning is proposed. First, the collected data are preprocessed by decoding, generalization, and word segmentation, and the unlabeled data are then augmented according to their TF-IDF values. Next, the original and augmented data are vectorized with a fusion of TF-IDF and Word2Vec. Finally, the FlexUDA model is trained and compared with other models. Experimental results show that, trained with only 1000 labeled samples and 100000 unlabeled samples, the FlexUDA model achieves 99.42% accuracy and 99.23% recall. Compared with supervised models, it generalizes better and effectively addresses the overfitting caused by insufficient labeled data in SQL injection detection.
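The preprocessing stage named in the abstract (decoding, generalization, word segmentation) can be illustrated with a minimal Python sketch. The abstract does not specify the actual rules, so everything below is an assumption made for illustration: the function names, the repeated URL-decoding loop, the placeholder tokens `hexval` and `0`, and the tokenizer pattern are hypothetical and are not the authors' implementation.

```python
import re
from urllib.parse import unquote_plus

# Hypothetical generalization and tokenization rules (not taken from the paper).
_HEX = re.compile(r"\b0x[0-9a-f]+\b")
_NUM = re.compile(r"\b\d+(?:\.\d+)?\b")
_TOKEN = re.compile(r"[a-z_][a-z0-9_]*|\d+|<=|>=|<>|!=|--|[^\sa-z0-9_]")


def decode(payload, max_rounds=3):
    """URL-decode repeatedly so that nested encodings such as %2527 are undone.

    Assumption: '+' is treated as an encoded space, as in query strings.
    """
    for _ in range(max_rounds):
        decoded = unquote_plus(payload)
        if decoded == payload:
            break
        payload = decoded
    return payload


def generalize(text):
    """Lower-case the text and collapse literals into placeholder tokens."""
    text = text.lower()
    text = _HEX.sub("hexval", text)  # hex literals -> one placeholder
    text = _NUM.sub("0", text)       # decimal literals -> "0"
    return text


def tokenize(text):
    """Split into identifiers, numbers, multi-character operators, and symbols."""
    return _TOKEN.findall(text)


def preprocess(payload):
    """Run the three steps in the order given in the abstract."""
    return tokenize(generalize(decode(payload)))


if __name__ == "__main__":
    print(preprocess("id=1%27%20OR%201%3D1--"))
```

Collapsing numeric and hex literals into placeholders keeps the vocabulary small, which matters for the later TF-IDF and Word2Vec steps; the real generalization rules used by the authors may differ.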

Key words: SQL injection detection, Semi-supervised learning, Unsupervised data augmentation, Dynamic threshold

CLC Number: 

  • TP393.08