Computer Science, 2026, Vol. 53, Issue (3): 424-432. doi: 10.11896/jsjkx.250200124

• Information Security •

Dual-channel Source Code Vulnerability Detection Model Based on Contrastive Learning

SONG Jianhua1,3,4,5, HE Jiawei1, ZHANG Yan2,3,5   

    1 School of Cyber Science and Technology, Hubei University, Wuhan 430062, China
    2 School of Computer Science, Hubei University, Wuhan 430062, China
    3 Key Laboratory of Intelligent Sensing System and Security (Hubei University), Ministry of Education, Wuhan 430062, China
    4 Hubei Provincial Engineering Research Center of Intelligent Connected Vehicle Network Security, Wuhan 430062, China
    5 Hubei Key Laboratory of Big Data Intelligent Analysis and Application, Hubei University, Wuhan 430062, China
  • Received: 2025-02-27 Revised: 2025-05-23 Published: 2026-03-12
  • About author: SONG Jianhua, born in 1973, Ph.D., professor, master’s supervisor, is a member of CCF (No. 27785M). Her main research interests include network and information security.
    HE Jiawei, born in 2001, postgraduate. His main research interests include source code vulnerability detection.
  • Supported by:
    National Natural Science Foundation of China (62377009), Major Project of Hubei Province (JD) (2023BAA018), Key Project of Hubei Provincial Key R&D Program (2021BAA184, 2021BAA188), Research Center for Performance Evaluation and Information Management of Key Research Bases for Humanities and Social Sciences in Hubei Provincial Colleges and Universities (2020JX01), and Major Science and Technology Special Project of Hubei Science and Technology Plan (2024BAA008).

Abstract: As the number of software vulnerabilities continues to grow, system security faces severe challenges. Source code vulnerability detection can identify potential security threats during the development phase, which is crucial for ensuring the security of software applications. The mainstream approach to source code vulnerability detection is currently based on deep learning. However, many existing deep learning models rely on a single form of features and fail to fully exploit both the global and the local information in source code semantics. These models also often overlook the similarities and differences between samples, which leads to poor performance on complex vulnerability patterns, with high false positive and false negative rates. To address these issues, a dual-channel source code vulnerability detection model based on contrastive learning is proposed. The model uses separate channels to extract global and local features from source code semantics, and introduces contrastive learning so that the model learns the similarities and differences between samples, thereby optimizing the feature extraction process. Experimental results on the real-world vulnerability datasets Devign and Reveal show that the model significantly improves recall and F1 score over the baseline models. On average, recall and F1 score improve by 14.65 and 6.30 percentage points, respectively, on Devign, and by 31.18 and 22.44 percentage points, respectively, on Reveal.
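The abstract does not spell out the contrastive objective, so the following is only a minimal sketch of one common choice, a supervised contrastive loss over labeled code embeddings. The function names, the temperature value, and the toy two-dimensional embeddings are all assumptions made for illustration, not the authors' implementation. The idea it shows is the one the abstract describes: samples with the same label (vulnerable or non-vulnerable) are pulled together and samples with different labels are pushed apart.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Hypothetical helper: mean supervised contrastive loss over a batch.

    For each anchor i with at least one same-label sample, average
    -log softmax(sim(i, p) / temperature) over its positives p, where the
    softmax runs over every other sample in the batch.
    """
    n = len(embeddings)
    total, counted = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        logits = {
            j: cosine(embeddings[i], embeddings[j]) / temperature
            for j in range(n)
            if j != i
        }
        # Log-sum-exp with the maximum subtracted for numerical stability.
        peak = max(logits.values())
        log_denom = peak + math.log(sum(math.exp(x - peak) for x in logits.values()))
        total += -sum(logits[p] - log_denom for p in positives) / len(positives)
        counted += 1
    return total / counted if counted else 0.0


# Toy batch: two embeddings near the x-axis, two near the y-axis.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned_loss = supervised_contrastive_loss(toy_embeddings, [0, 0, 1, 1])
shuffled_loss = supervised_contrastive_loss(toy_embeddings, [0, 1, 0, 1])
```

When the labels agree with the geometric clusters, the loss is close to zero; when the labels cut across the clusters, it is much larger. In a dual-channel model, a term like this would typically be added to the classification loss, though how the paper weights or combines the two is not stated in the abstract.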

Key words: Source code vulnerability detection, Dual-channel network model, Contrastive learning, Cross attention, Feature fusion

CLC Number: TP311