Computer Science ›› 2025, Vol. 52 ›› Issue (11A): 241200059-7.doi: 10.11896/jsjkx.241200059

• Computer Software & Architecture •

Semantic Variations Based Defect Generation and Prediction Model Testing

GUO Liwei1, WU Yonghao2, LIU Yong1   

  1 College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China
  2 School of Information Engineering, Beijing Institute of Petrochemical Technology, Beijing 102627, China
  • Online: 2025-11-15  Published: 2025-11-10
  • Supported by:
    National Natural Science Foundation of China (61902015, 61872026, 61672085).

Abstract: In recent years, machine learning techniques have made significant advances in defect prediction for software development, enabling the automatic detection of errors in large-scale codebases. These advances are expected to enhance the reliability, security, and overall quality of software. Defect prediction models can autonomously identify whether code contains errors. However, while existing models have clear strengths, they also exhibit limitations: they often fail to identify vulnerabilities, or they incorrectly label defective code segments as problem-free. Moreover, systematic empirical studies on the quality of defect prediction models are still lacking. The existing method, DPTester, assesses the effectiveness of defect prediction models by generating defective code through modifications to if conditions. However, the defective code produced this way is overly simplistic, and the evaluation does not cover a wide range of models, including the latest large language models. To address this gap, this paper proposes an improved method called DefectGen. The new approach introduces multiple strategies to generate defective code that more closely reflects real-world issues, and it extends the evaluation of defect prediction models to large language models. Experimental results indicate that DefectGen significantly improves the generation of complex defective code compared with previous methods, producing 1.2 times more defective code from a single correct code instance. When testing the CodeT5+, CodeBERT, and GPT-4o models, the proportions of incorrect defect predictions are 62%, 78%, and 30%, respectively. Additionally, DefectGen demonstrates higher efficiency in both the test input generation and defect detection phases, with generation and detection times of 0.003 seconds and 0.02 seconds per test input, respectively. These results suggest that DefectGen not only effectively exposes the limitations of existing models but also provides new opportunities for improving defect prediction models and enhancing software quality assurance processes.
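To make the defect-generation idea concrete: the abstract describes producing defective variants of correct code by semantic variations such as modifying if conditions. The sketch below is an illustrative assumption, not the paper's implementation; the class and function names (`IfConditionMutator`, `inject_defect`) are hypothetical, and it shows only the simplest variation strategy (flipping one comparison operator), of the kind DPTester is said to apply.

```python
# Hypothetical sketch of if-condition mutation for defect generation.
# The mutation operator (negating a comparison) and all names below are
# illustrative assumptions, not DefectGen's or DPTester's actual code.
import ast

# Map each comparison operator to its logical negation.
_NEGATE = {ast.Lt: ast.GtE, ast.GtE: ast.Lt,
           ast.Gt: ast.LtE, ast.LtE: ast.Gt,
           ast.Eq: ast.NotEq, ast.NotEq: ast.Eq}

class IfConditionMutator(ast.NodeTransformer):
    """Flip the comparison operator of the first eligible `if` condition."""
    def __init__(self):
        self.mutated = False

    def visit_If(self, node):
        test = node.test
        if (not self.mutated and isinstance(test, ast.Compare)
                and type(test.ops[0]) in _NEGATE):
            test.ops[0] = _NEGATE[type(test.ops[0])]()  # introduce the defect
            self.mutated = True
        return self.generic_visit(node)

def inject_defect(source: str) -> str:
    """Return a semantically altered (defective) variant of `source`."""
    tree = ast.parse(source)
    tree = IfConditionMutator().visit(tree)
    return ast.unparse(tree)

correct = (
    "def is_adult(age):\n"
    "    if age >= 18:\n"
    "        return True\n"
    "    return False\n"
)
print(inject_defect(correct))  # the `>=` check becomes `<`
```

A defect prediction model under test would then be queried on both the original and the mutated version: if it labels the mutated (now incorrect) code as problem-free, that counts as one of the incorrect defect predictions the paper measures.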

Key words: Defect prediction, Machine learning, Large language models

CLC Number: TP311.53