Computer Science ›› 2025, Vol. 52 ›› Issue (6): 52-57. doi: 10.11896/jsjkx.240700119

• Computer Software •

Graph Neural Network Defect Prediction Method Combined with Developer Dependencies

QIAO Yu1, XU Tao2, ZHANG Ya1, WEN Fengpeng1, LI Qiangwei1

  1 Department of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
  2 Network Center, Zaozhuang University, Zaozhuang, Shandong 277015, China
  • Received: 2024-07-18 Revised: 2024-09-02 Online: 2025-06-15 Published: 2025-06-11
  • Corresponding author: XU Tao (xutao@uzz.edu.cn)
  • About author: QIAO Yu (cklqiaoyu@126.com), born in 1999, postgraduate, is a member of CCF (No. R4417G). His main research interests include intelligent software engineering and software defect prediction.
    XU Tao, born in 1978, postgraduate, senior experimenter. His main research interests include neural networks, distributed storage, and software defect prediction.
  • Supported by:
    National Natural Science Foundation of China (62202223), Natural Science Foundation of Jiangsu Province, China (BK20220881), Open Project of the Key Laboratory of Security Critical Software of the Ministry of Industry and Information Technology (Nanjing University of Aeronautics and Astronautics) (NJ2022027).

Abstract: Timely identification and handling of high-risk defective modules is crucial in software development. Traditional software defect prediction methods rely primarily on code-related information and often overlook the influence of developers' individual characteristics on software quality. To address this issue, this study proposes DCN4SDP, a software defect prediction model that incorporates a developer consistency dependency network. The model first constructs the developer consistency dependency network from developer information and extracts code-related metrics as the initial features of the network. It then employs a bidirectional gated graph neural network (BiGGNN) to learn node features over the network structure. Experimental results show that DCN4SDP significantly outperforms traditional machine learning classifiers and other deep learning methods on multiple standard datasets, achieving an AUC of 0.91 and an F1 score of 0.76, both notably higher than those of the compared models. These results indicate that integrating the developer dimension into software defect prediction can effectively improve a model's predictive capability and practical value, and they offer new ideas and directions for future research in software defect prediction.

Key words: Software defect prediction, Bidirectional gated graph neural network, Developer information, Deep learning, Graph neural network, Software engineering
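The pipeline outlined in the abstract (build a network from developer information, seed each node with code metrics, then propagate features in both edge directions with gating) can be illustrated with a minimal sketch. This is not the DCN4SDP implementation. The abstract does not define how the network is built, so the edge rule here (keep a file dependency only when both files share a developer) is an assumption, as are all file names, developer names, metric values, and function names. The gates are untrained, parameter-free stand-ins for the learned fusion and GRU update of a real BiGGNN.

```python
import math

def build_edges(dependencies, authors):
    # Assumed rule (not from the paper): keep a directed dependency
    # (src, dst) only when the two files share at least one developer.
    return [(s, d) for s, d in dependencies if authors[s] & authors[d]]

def mean_of(neighbors, h, fallback):
    # Average the neighbours' feature vectors; a node with no neighbour
    # in this direction falls back to its own vector.
    if not neighbors:
        return list(fallback)
    dim = len(fallback)
    return [sum(h[n][i] for n in neighbors) / len(neighbors) for i in range(dim)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def propagate(h, edges):
    # One simplified bidirectional gated step over all nodes.
    new_h = {}
    for node, vec in h.items():
        incoming = [s for s, d in edges if d == node]
        outgoing = [d for s, d in edges if s == node]
        h_in = mean_of(incoming, h, vec)
        h_out = mean_of(outgoing, h, vec)
        updated = []
        for x, a, b in zip(vec, h_in, h_out):
            z = sigmoid(a - b)           # gate fusing the two directions
            msg = z * a + (1.0 - z) * b  # fused neighbourhood message
            u = sigmoid(msg - x)         # update gate (stand-in for a GRU)
            updated.append((1.0 - u) * x + u * msg)
        new_h[node] = updated
    return new_h

# Hypothetical toy project: three files, their developers, two dependencies.
authors = {"A.java": {"alice"}, "B.java": {"alice", "bob"}, "C.java": {"carol"}}
dependencies = [("A.java", "B.java"), ("B.java", "C.java")]
metrics = {"A.java": [1.0, 0.0], "B.java": [0.0, 1.0], "C.java": [0.5, 0.5]}

edges = build_edges(dependencies, authors)
h1 = propagate(metrics, edges)
```

In this toy example only the A.java to B.java dependency survives the developer filter, so C.java is isolated and keeps its initial metrics, while A.java and B.java move toward each other's features. A trained model would replace the fixed gates with learned weights and feed the final node vectors to a classifier.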

CLC Number: 

  • TP311