计算机科学 ›› 2021, Vol. 48 ›› Issue (12): 149-158.doi: 10.11896/jsjkx.210100200
张健雄1, 宋坤1, 何鹏1,2, 李兵3
ZHANG Jian-xiong1, SONG Kun1, HE Peng1,2, LI Bing3
摘要: 软件系统中通常存在一些在拓扑结构上处于核心位置的关键类,这些类上的缺陷往往会给系统带来极大的安全隐患,识别关键类对工程师理解或维护一个软件系统至关重要。针对这一问题,提出一种基于图神经网络的关键类识别方法。首先利用复杂网络理论,将软件系统抽象为软件网络;其次结合无监督网络节点嵌入学习以及邻域聚合的方式,构建一个编码-解码(encoder-decoder)框架,提取软件系统中类节点的表征向量;最后利用Pairwise排序学习实现网络中节点的重要性排序,从而实现软件系统中关键类的识别。为验证所提方法的有效性,选取4个Java开源软件作为实验对象,并与常用的5种节点重要性度量方法以及2个已有工作进行对比分析。实验结果表明:与介数中心性、K-core、接近中心性、节点收缩法和PageRank等方法相比,该方法识别关键类的效果更好;另外,相比已有工作,在前15%的关键类节点中,所提方法的召回率和准确率的提高幅度均在10%以上。
中图分类号:
[1]PAN W,LI B,MA Y,et al.Measuring structural quality of object-oriented softwares via bug propagation analysis on weighted software networks[J].Journal of Computer Science and Technology,2010,25(6):1202-1213. [2]MAGGIE H,KATERINA G P.Common trends in software fault and failure data[J].IEEE Transactions on Software Engineering,2009,35(4):484-496. [3]DAI J Y,WANG B,SHENG J F,et al.Identifying influential nodes in complex networks based on local neighbor contribution[J].IEEE Access,2019,DOI:10.1109/ACCESS.2019.2939804. [4]RUAN Y R,LAO S Y,WANG J D,et al.Node importance measurement based on neighborhood similarity in complex network[J].Journal of Physics,2017,66(3):1-8. [5]TAN Y J,WU J,DENG H Z.Evaluation Method for Node Importance base on Node Contractionin Complex Networks[J].Systems Engineering Theory and Practice,2006,26(11):79-83. [6]BIAN T,HU J T,DENG Y.Identifying influential nodes in complex networks based on AHP[J].Physica A Statistical Mechanics and its Applications,2017,479:422-436. [7]WATTS D J,STROGATZ S H.Collective dynamics of ‘small-world' networks[J].Nature,1998,393:440-442. [8]BARABÁSI A L,ALBERT R.Emergence of scaling in random networks[J].Science,1999,286:509-512. [9]CONCAS G,LOCCI M F,PINNA S,et al.Fractal dimension in software networks[J].Europhysics Letters,2006,76(6):1221-1227. [10]WANG X F,LI X,CHEN G R.Complex Network Theory and Its Application[M].Beijing:Tsinghua University Press,2006. [11]HE K Q,MA Y T,LIU J,et al.Software Network[M].Beijing:Science Press,2008. [12]MYERS C R.Software systems as complex networks:Struc- ture,function,and evolvability of software collaboration graphs[J].Phys. Rev. E,2003,68:46116. [13]PAN W,LI B,MA Y,et al.Class structure refactoring of object-oriented softwares using community detection in dependency[J].Networks Frontiers of Computer Science in China,2009,3(3):396-404. [14]VALVERDE S,CANCHO R F,SOLÉ R V.Scale free networks from optimal design[J].Europhysics Letters,2002,60(4):512-517. [15]OPSAHL T,AGNEESSENS F,SKVORETZ J.Node centrality in weighted networks:generalizing degree and shortest paths[J].Social Networks,2010,32(3):245-251. [16]KITSAK M,GALLOS L K,HAVLIN S,et al.Identification of influential spreaders in complex networks[J].Nature Physics,2010,6(11):888-893. [17]LI P X,RENG Y Q,XI Y M.An Importance Measure of Actors(Set) within a Network[J].Systems Engineering,2004,22:13-20. [18]BRIN S,PAGE L.The anatomy of a large-scale hypertextual web search engine[J].Computer Networks,2012,56(18):3825-3833. [19]KLEINBERG J M.Authoritative sources in a hyperlinked environment[J].Journal of the ACM,1999,46(5):604-632. [20]WANG M,PAN W.A comparative study of network centrality metrics in identifying key classes in software[J].Journal of Computational Information Systems,2012,8(24):10205-10212. [21]PERIN F,RENGGLI L,RESSIA J.Ranking software artifacts[C]//Proceedings of 4th Workshop on FAMIX and Moose in Software Reengineering(FAMOO-Sr'10).2010:1-4. [22]WANG M S,LU H M.Identifying Key Classes Using h-Index and its Variants[J].Computer Science and Exploration,2011,5(10):891-903. [23]HU S W,LI B,HE P,et al.Approach Based h-index to Measu-ring the Important Classes in Software Network[J].Small Microcomputer System,2017(2):249-253. [24]WANG Y,YU H,ZHU Z L.A Class Integration Test Order Method Based on the Node Importance of Software[J].Compu-ter Research and Development,2016,53(3):17-30. [25]LI D W,LI B,HE P,et al.Ranking the importance of classes via software structural analysis[J].Future Communication,Computing,Control and Management,2012,141:441-449. [26]JIANG S J,JU X L,WANG X Y,et al.Measuring the Importance of Classes Using UIO Sequence[J].Electronic Journals,2015,43(10):2062-2068. [27]WANG J,AI J,YANG Y,et al.Identifying key classes of object-oriented software based on software complex network[C]//International Conference on System Reliability & Safety.IEEE,2017:444-449. [28]SRINIVASAN S M,SANGWAN R S,NEILL C J.On the measures for ranking software components[J].Innovations in Systems and Software Engineering,2017,13:161-175. [29]PAN W F,MING H,CARL K,et al.ElementRank:Ranking Java software classes and packages using multilayer complex network-based approach[C]//IEEE Transactions on Software Engineering.2019. [30]PAN W F,SONG B B,LI K S,et al.Identifying key classes in object-oriented software using generalized k-core decomposition[J].Future Generation Computer Systems,2018,81:188-202. [31]GORI M,MONFARDINI G,SCARSELLI F.A new model for learning in graph domains[C]//Proc. of IJCNN. IEEE,2005:729-734. [32]SCARSELLI F,GORI M,TSOI A C,et al.The graph neural network model[J].IEEE Transactions on Neural Networks,2009,20(1):61-80. [33]GALLICCHIO C,MICHELI A.Graph echo state networks [C]//IJCNN.IEEE,2010:1-8. [34]SIMONOVSKY M,KOMODAKIS N.Graphvae:Towards gene-ration of small graphs using variational autoencoders[C]//International Conference on Artificial Neural Networks.2018:412-422. [35]BOJCHEVSKI A,SHCHUR O,ZUGNER D,et al.Netgan: Generating graphs via random walks[C]//Proceedings of the 35th Int. Conference on Machine Learning.2018:610-619. [36]LIU T Y.Learning to Rank for Information Retrieval[M].Berlin:Springer,2011. [37]SHI Z,KEUNG J,BENNIN K E,et al.Comparing learning to rank techniques in hybrid bug localization[J].Applied Soft Computing,2018,62:636-648. [38]BURGES C,SHAKED T,RENSHAW E,et al.Learning to rank using gradient descent[C]//Proceedings of the 22Nd International Conference on Machine Learning.Bonn,Germany,2005:89-96. [39]FREUND Y,IYER R,SCHAPIRE R E,et al.An efficient boosting algorithm for combining preferences[J].Journal of Machine Learning Research,2004,4(6):170-178. [40]WU Q,BURGES C C,SVORE K,et al.Adapting boosting for information retrieval measures[J].Inf. Retrieval.2010,13(3):254-270. [41]KIPF T N,WELLING M.Semi-supervised classification with graph convolutional networks[C]//Proc. of International Conference on Learning Representations.Toulon,France,2017,arXiv:1609.02907v4. [42]HAMILTON W L,YING R,LESKOVEC J.Inductive representation learning on large graphs[C]//31st Conference on Neural Information Processing Systems(NIPS 2017).Long Beach,CA,USA,2017:1-11. [43]LI Y,TARLOW D,BROCKSCHMIDT M,et al.Gated graph sequence neural networks[J].arXiv:1511.05493,2015. [44]XU K,LI C T,TIAN Y L,et al.Representation learning on graphs with jumping knowledge networks[C]//Proceedings of the 35th International Conference on Machine Learning.Stockholm,Sweden,2018:5449-5458. [45]ZAIDMAN A,DEMEYER S.Automatic identification of key classes in a software system using web mining techniques[J].Journal of Software Maintenance and Evolution:Research and Practice,2008,20(6):387-417. [46]IYER S,KILLINGBACK T,SUNDARAM B,et al.Attack robustness and centrality of complex networks[J].PLoS ONE,2013,8(4):e59613. [47]FAN C,ZENG L,DING Y,et al.Learning to identify high betweenness centrality nodes from scratch:a novel graph neural network approach[J].arXiv:1905.10418v1. |
[1] | 周芳泉, 成卫青. 基于全局增强图神经网络的序列推荐 Sequence Recommendation Based on Global Enhanced Graph Neural Network 计算机科学, 2022, 49(9): 55-63. https://doi.org/10.11896/jsjkx.210700085 |
[2] | 闫佳丹, 贾彩燕. 基于双图神经网络信息融合的文本分类方法 Text Classification Method Based on Information Fusion of Dual-graph Neural Network 计算机科学, 2022, 49(8): 230-236. https://doi.org/10.11896/jsjkx.210600042 |
[3] | 齐秀秀, 王佳昊, 李文雄, 周帆. 基于概率元学习的矩阵补全预测融合算法 Fusion Algorithm for Matrix Completion Prediction Based on Probabilistic Meta-learning 计算机科学, 2022, 49(7): 18-24. https://doi.org/10.11896/jsjkx.210600126 |
[4] | 杨炳新, 郭艳蓉, 郝世杰, 洪日昌. 基于数据增广和模型集成策略的图神经网络在抑郁症识别上的应用 Application of Graph Neural Network Based on Data Augmentation and Model Ensemble in Depression Recognition 计算机科学, 2022, 49(7): 57-63. https://doi.org/10.11896/jsjkx.210800070 |
[5] | 熊中敏, 舒贵文, 郭怀宇. 融合用户偏好的图神经网络推荐模型 Graph Neural Network Recommendation Model Integrating User Preferences 计算机科学, 2022, 49(6): 165-171. https://doi.org/10.11896/jsjkx.210400276 |
[6] | 邓朝阳, 仲国强, 王栋. 基于注意力门控图神经网络的文本分类 Text Classification Based on Attention Gated Graph Neural Network 计算机科学, 2022, 49(6): 326-334. https://doi.org/10.11896/jsjkx.210400218 |
[7] | 余皑欣, 冯秀芳, 孙静宇. 结合物品相似性的社交信任推荐算法 Social Trust Recommendation Algorithm Combining Item Similarity 计算机科学, 2022, 49(5): 144-151. https://doi.org/10.11896/jsjkx.210300217 |
[8] | 李勇, 吴京鹏, 张钟颖, 张强. 融合快速注意力机制的节点无特征网络链路预测算法 Link Prediction for Node Featureless Networks Based on Faster Attention Mechanism 计算机科学, 2022, 49(4): 43-48. https://doi.org/10.11896/jsjkx.210800276 |
[9] | 曹合心, 赵亮, 李雪峰. 图神经网络在Text-to-SQL解析中的技术研究 Technical Research of Graph Neural Network for Text-to-SQL Parsing 计算机科学, 2022, 49(4): 110-115. https://doi.org/10.11896/jsjkx.210200173 |
[10] | 苗旭鹏, 周跃, 邵蓥侠, 崔斌. GSO:基于图神经网络的深度学习计算图子图替换优化框架 GSO:A GNN-based Deep Learning Computation Graph Substitutions Optimization Framework 计算机科学, 2022, 49(3): 86-91. https://doi.org/10.11896/jsjkx.210700199 |
[11] | 陈世聪, 袁得嵛, 黄淑华, 杨明. 基于结构深度网络嵌入模型的节点标签分类算法 Node Label Classification Algorithm Based on Structural Depth Network Embedding Model 计算机科学, 2022, 49(3): 105-112. https://doi.org/10.11896/jsjkx.201000177 |
[12] | 郭磊, 马廷淮. 基于好友亲密度的用户匹配 Friend Closeness Based User Matching 计算机科学, 2022, 49(3): 113-120. https://doi.org/10.11896/jsjkx.210200137 |
[13] | 杨旭华, 王磊, 叶蕾, 张端, 周艳波, 龙海霞. 基于节点相似性和网络嵌入的复杂网络社区发现算法 Complex Network Community Detection Algorithm Based on Node Similarity and Network Embedding 计算机科学, 2022, 49(3): 121-128. https://doi.org/10.11896/jsjkx.210200009 |
[14] | 郑苏苏, 关东海, 袁伟伟. 融合不完整多视图的异质信息网络嵌入方法 Heterogeneous Information Network Embedding with Incomplete Multi-view Fusion 计算机科学, 2021, 48(9): 68-76. https://doi.org/10.11896/jsjkx.210500203 |
[15] | 张人之, 朱焱. 基于主动学习的社交网络恶意用户检测方法 Malicious User Detection Method for Social Network Based on Active Learning 计算机科学, 2021, 48(6): 332-337. https://doi.org/10.11896/jsjkx.200700151 |
|