Computer Science ›› 2022, Vol. 49 ›› Issue (12): 81-88. doi: 10.11896/jsjkx.211100040

• Computer Software •

Automatic Assignment Method for Software Bug Based on Multivariate Features of Developers

DONG Xia-lei1, XIANG Zheng-long2, WU Hong-run3, WANG Ding-wen1, LI Yuan-xiang1

  1 School of Computer Science, Wuhan University, Wuhan 430072, China
  2 School of Computer and Software, Nanjing University of Information Science and Technology, Nanjing 210044, China
  3 School of Physics and Information Engineering, Minnan Normal University, Zhangzhou, Fujian 363000, China
  • Received: 2021-11-03 Revised: 2022-04-23 Published: 2022-12-14
  • Corresponding author: WU Hong-run (dr.hongrunwu@gmail.com)
  • About author: DONG Xia-lei (xldong_whu@qq.com), born in 1998, postgraduate. His main research interests include deep learning and intelligent software. WU Hong-run, born in 1989, Ph.D., associate professor, is a member of China Computer Federation. Her main research interests include complex networks and graph neural networks.
  • Supported by:
    National Natural Science Foundation of China (62106092, 61672391).

Abstract: Fixing software bugs is an unavoidable part of the software life cycle, and assigning bugs to developers automatically and efficiently is an important research direction. Existing methods mainly focus on the text content of bug reports or on shallow information in the developers’ tossing network, while ignoring the high-level topological information in that network. To address this, this paper proposes MFD-GCN, an automatic bug assignment model based on developers’ multivariate features. The model takes the high-level topological features of the developers’ tossing network into account and uses a graph convolutional network to extract multivariate features that represent developers’ deep cooperation relationships and fixing preferences. These features are then used together with the text features of bug reports to train a classifier. The model is evaluated on two large open-source software projects, Eclipse and Mozilla. Experimental results show that, compared with mainstream bug assignment methods proposed in recent years, MFD-GCN achieves better results when recommending the top K developers. Its Top-1 recommendation accuracy reaches 69.8% on Eclipse and 59.7% on Mozilla.

Key words: Automatic assignment, Bug report, Developers’ tossing network, Graph convolution network, Multivariate feature
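
To make the idea of extracting developer features from a tossing network more concrete, the following is a minimal sketch of one generic graph convolution step in the style of Kipf and Welling, applied to a toy developer graph. It is not the MFD-GCN architecture from the paper: the four-developer graph, the feature values, and the weight matrix are hypothetical, and the tossing network is simplified here to an undirected, unweighted graph.

```python
import math


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square 0/1 adjacency matrix."""
    n = len(adj)
    # Add self-loops so each developer keeps part of its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(norm_adj, features, weights):
    """One propagation step: ReLU(norm_adj @ features @ weights)."""
    hidden = matmul(matmul(norm_adj, features), weights)
    return [[max(0.0, v) for v in row] for row in hidden]


# Hypothetical tossing network over 4 developers with edges 0-1, 1-2, 2-3.
ADJ = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
# Hypothetical 2-dimensional input features, one row per developer.
FEATURES = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
# Hypothetical weight matrix; in a trained model these values are learned.
WEIGHTS = [[1.0, -1.0], [0.5, 1.0]]

NORM_ADJ = normalize_adjacency(ADJ)
EMBEDDINGS = gcn_layer(NORM_ADJ, FEATURES, WEIGHTS)

for dev, row in enumerate(EMBEDDINGS):
    print(dev, [round(v, 3) for v in row])
```

In this sketch developer 3 starts with an all-zero feature vector but still receives a non-zero embedding from its neighbour, which is the sense in which graph convolution carries topological information into each node's representation. Stacking several such layers would let information travel over longer paths in the graph.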

CLC Number:

  • TP311.5