Computer Science ›› 2021, Vol. 48 ›› Issue (12): 149-158.doi: 10.11896/jsjkx.210100200

• Computer Software • Previous Articles     Next Articles

Identification of Key Classes in Software Systems Based on Graph Neural Networks

ZHANG Jian-xiong1, SONG Kun1, HE Peng1,2, LI Bing3   

  1. 1 School of Computer and Information Engineering,Hubei University,Wuhan 430062,China
    2 Hubei Provincial Key Laboratory of Applied Mathematics,Wuhan 430062,China
    3 School of Computer Science,Wuhan University,Wuhan 430072,China
  • Received:2021-01-26 Revised:2021-05-09 Online:2021-12-15 Published:2021-11-26
  • About author:ZHANG Jian-xiong,born in 1998,postgraduate.His main research interests include network embedding,neural network and software network.
    HE Peng,born in 1988,Ph.D,associate professor,postgraduate supervisor,is a member of China Computer Federation.His main research interests include software engineering and complex networks.
  • Supported by:
    National Key R & D Program of China(2018YFB1003801),National Natural Science Foundation of China(61832014,61902114,61977021),Science and Technology Innovation Program of Hubei Province(2019ACA144) and Open Foundation of Hubei Key Laboratory of Applied Mathematics(HBAM201901).

Abstract: There are usually some key classes which are in the core position in the topology structure of software systems.The defects in these classes will bring great security risks to the system.Therefore,it is very important to identify these key classes for engineers to understand or maintain an unfamiliar software system.To do this,the paper proposes a novel method of identifying key classes based on graph neural networks.Specifically,the software system is abstracted as software network by using complex network theory,and then by combining unsupervised network embedding learning and neighborhood aggregation mode,we construct an encoder-decoder framework to extract the representation vector of class nodes in software system.Finally,according to the obtained node representations,Pairwise learning-to-rank algorithm is adopted to realize the importance ranking of nodes,so as to achieve the identification of key classes in software system.In order to verify the effectiveness of our method,an empirical analysis of four object-oriented Java open-source software is done,and we compare it with five commonly used node importance measurement methods and two existing works.The experimental results show that,compared with node centrality,K-core and PageRank,the proposed method is more effective in identifying key classes from the perspective of network robustness.In addition,on the existing public labeled dataset,the recall and precision of this paper are better at the top 15% percent of nodes,and improved by more than 10%.

Key words: Graph neural network, Key class identification, Learning to rank, Network embedding, Software network

CLC Number: 

  • TP311.53
[1]PAN W,LI B,MA Y,et al.Measuring structural quality of object-oriented softwares via bug propagation analysis on weighted software networks[J].Journal of Computer Science and Technology,2010,25(6):1202-1213.
[2]MAGGIE H,KATERINA G P.Common trends in software fault and failure data[J].IEEE Transactions on Software Engineering,2009,35(4):484-496.
[3]DAI J Y,WANG B,SHENG J F,et al.Identifying influential nodes in complex networks based on local neighbor contribution[J].IEEE Access,2019,DOI:10.1109/ACCESS.2019.2939804.
[4]RUAN Y R,LAO S Y,WANG J D,et al.Node importance measurement based on neighborhood similarity in complex network[J].Journal of Physics,2017,66(3):1-8.
[5]TAN Y J,WU J,DENG H Z.Evaluation Method for Node Importance base on Node Contractionin Complex Networks[J].Systems Engineering Theory and Practice,2006,26(11):79-83.
[6]BIAN T,HU J T,DENG Y.Identifying influential nodes in complex networks based on AHP[J].Physica A Statistical Mechanics and its Applications,2017,479:422-436.
[7]WATTS D J,STROGATZ S H.Collective dynamics of ‘small-world' networks[J].Nature,1998,393:440-442.
[8]BARABÁSI A L,ALBERT R.Emergence of scaling in random networks[J].Science,1999,286:509-512.
[9]CONCAS G,LOCCI M F,PINNA S,et al.Fractal dimension in software networks[J].Europhysics Letters,2006,76(6):1221-1227.
[10]WANG X F,LI X,CHEN G R.Complex Network Theory and Its Application[M].Beijing:Tsinghua University Press,2006.
[11]HE K Q,MA Y T,LIU J,et al.Software Network[M].Beijing:Science Press,2008.
[12]MYERS C R.Software systems as complex networks:Struc- ture,function,and evolvability of software collaboration graphs[J].Phys. Rev. E,2003,68:46116.
[13]PAN W,LI B,MA Y,et al.Class structure refactoring of object-oriented softwares using community detection in dependency[J].Networks Frontiers of Computer Science in China,2009,3(3):396-404.
[14]VALVERDE S,CANCHO R F,SOLÉ R V.Scale free networks from optimal design[J].Europhysics Letters,2002,60(4):512-517.
[15]OPSAHL T,AGNEESSENS F,SKVORETZ J.Node centrality in weighted networks:generalizing degree and shortest paths[J].Social Networks,2010,32(3):245-251.
[16]KITSAK M,GALLOS L K,HAVLIN S,et al.Identification of influential spreaders in complex networks[J].Nature Physics,2010,6(11):888-893.
[17]LI P X,RENG Y Q,XI Y M.An Importance Measure of Actors(Set) within a Network[J].Systems Engineering,2004,22:13-20.
[18]BRIN S,PAGE L.The anatomy of a large-scale hypertextual web search engine[J].Computer Networks,2012,56(18):3825-3833.
[19]KLEINBERG J M.Authoritative sources in a hyperlinked environment[J].Journal of the ACM,1999,46(5):604-632.
[20]WANG M,PAN W.A comparative study of network centrality metrics in identifying key classes in software[J].Journal of Computational Information Systems,2012,8(24):10205-10212.
[21]PERIN F,RENGGLI L,RESSIA J.Ranking software artifacts[C]//Proceedings of 4th Workshop on FAMIX and Moose in Software Reengineering(FAMOO-Sr'10).2010:1-4.
[22]WANG M S,LU H M.Identifying Key Classes Using h-Index and its Variants[J].Computer Science and Exploration,2011,5(10):891-903.
[23]HU S W,LI B,HE P,et al.Approach Based h-index to Measu-ring the Important Classes in Software Network[J].Small Microcomputer System,2017(2):249-253.
[24]WANG Y,YU H,ZHU Z L.A Class Integration Test Order Method Based on the Node Importance of Software[J].Compu-ter Research and Development,2016,53(3):17-30.
[25]LI D W,LI B,HE P,et al.Ranking the importance of classes via software structural analysis[J].Future Communication,Computing,Control and Management,2012,141:441-449.
[26]JIANG S J,JU X L,WANG X Y,et al.Measuring the Importance of Classes Using UIO Sequence[J].Electronic Journals,2015,43(10):2062-2068.
[27]WANG J,AI J,YANG Y,et al.Identifying key classes of object-oriented software based on software complex network[C]//International Conference on System Reliability & Safety.IEEE,2017:444-449.
[28]SRINIVASAN S M,SANGWAN R S,NEILL C J.On the measures for ranking software components[J].Innovations in Systems and Software Engineering,2017,13:161-175.
[29]PAN W F,MING H,CARL K,et al.ElementRank:Ranking Java software classes and packages using multilayer complex network-based approach[C]//IEEE Transactions on Software Engineering.2019.
[30]PAN W F,SONG B B,LI K S,et al.Identifying key classes in object-oriented software using generalized k-core decomposition[J].Future Generation Computer Systems,2018,81:188-202.
[31]GORI M,MONFARDINI G,SCARSELLI F.A new model for learning in graph domains[C]//Proc. of IJCNN. IEEE,2005:729-734.
[32]SCARSELLI F,GORI M,TSOI A C,et al.The graph neural network model[J].IEEE Transactions on Neural Networks,2009,20(1):61-80.
[33]GALLICCHIO C,MICHELI A.Graph echo state networks [C]//IJCNN.IEEE,2010:1-8.
[34]SIMONOVSKY M,KOMODAKIS N.Graphvae:Towards gene-ration of small graphs using variational autoencoders[C]//International Conference on Artificial Neural Networks.2018:412-422.
[35]BOJCHEVSKI A,SHCHUR O,ZUGNER D,et al.Netgan: Generating graphs via random walks[C]//Proceedings of the 35th Int. Conference on Machine Learning.2018:610-619.
[36]LIU T Y.Learning to Rank for Information Retrieval[M].Berlin:Springer,2011.
[37]SHI Z,KEUNG J,BENNIN K E,et al.Comparing learning to rank techniques in hybrid bug localization[J].Applied Soft Computing,2018,62:636-648.
[38]BURGES C,SHAKED T,RENSHAW E,et al.Learning to rank using gradient descent[C]//Proceedings of the 22Nd International Conference on Machine Learning.Bonn,Germany,2005:89-96.
[39]FREUND Y,IYER R,SCHAPIRE R E,et al.An efficient boosting algorithm for combining preferences[J].Journal of Machine Learning Research,2004,4(6):170-178.
[40]WU Q,BURGES C C,SVORE K,et al.Adapting boosting for information retrieval measures[J].Inf. Retrieval.2010,13(3):254-270.
[41]KIPF T N,WELLING M.Semi-supervised classification with graph convolutional networks[C]//Proc. of International Conference on Learning Representations.Toulon,France,2017,arXiv:1609.02907v4.
[42]HAMILTON W L,YING R,LESKOVEC J.Inductive representation learning on large graphs[C]//31st Conference on Neural Information Processing Systems(NIPS 2017).Long Beach,CA,USA,2017:1-11.
[43]LI Y,TARLOW D,BROCKSCHMIDT M,et al.Gated graph sequence neural networks[J].arXiv:1511.05493,2015.
[44]XU K,LI C T,TIAN Y L,et al.Representation learning on graphs with jumping knowledge networks[C]//Proceedings of the 35th International Conference on Machine Learning.Stockholm,Sweden,2018:5449-5458.
[45]ZAIDMAN A,DEMEYER S.Automatic identification of key classes in a software system using web mining techniques[J].Journal of Software Maintenance and Evolution:Research and Practice,2008,20(6):387-417.
[46]IYER S,KILLINGBACK T,SUNDARAM B,et al.Attack robustness and centrality of complex networks[J].PLoS ONE,2013,8(4):e59613.
[47]FAN C,ZENG L,DING Y,et al.Learning to identify high betweenness centrality nodes from scratch:a novel graph neural network approach[J].arXiv:1905.10418v1.
[1] ZHOU Fang-quan, CHENG Wei-qing. Sequence Recommendation Based on Global Enhanced Graph Neural Network [J]. Computer Science, 2022, 49(9): 55-63.
[2] YAN Jia-dan, JIA Cai-yan. Text Classification Method Based on Information Fusion of Dual-graph Neural Network [J]. Computer Science, 2022, 49(8): 230-236.
[3] QI Xiu-xiu, WANG Jia-hao, LI Wen-xiong, ZHOU Fan. Fusion Algorithm for Matrix Completion Prediction Based on Probabilistic Meta-learning [J]. Computer Science, 2022, 49(7): 18-24.
[4] YANG Bing-xin, GUO Yan-rong, HAO Shi-jie, Hong Ri-chang. Application of Graph Neural Network Based on Data Augmentation and Model Ensemble in Depression Recognition [J]. Computer Science, 2022, 49(7): 57-63.
[5] XIONG Zhong-min, SHU Gui-wen, GUO Huai-yu. Graph Neural Network Recommendation Model Integrating User Preferences [J]. Computer Science, 2022, 49(6): 165-171.
[6] DENG Zhao-yang, ZHONG Guo-qiang, WANG Dong. Text Classification Based on Attention Gated Graph Neural Network [J]. Computer Science, 2022, 49(6): 326-334.
[7] YU Ai-xin, FENG Xiu-fang, SUN Jing-yu. Social Trust Recommendation Algorithm Combining Item Similarity [J]. Computer Science, 2022, 49(5): 144-151.
[8] LI Yong, WU Jing-peng, ZHANG Zhong-ying, ZHANG Qiang. Link Prediction for Node Featureless Networks Based on Faster Attention Mechanism [J]. Computer Science, 2022, 49(4): 43-48.
[9] CAO He-xin, ZHAO Liang, LI Xue-feng. Technical Research of Graph Neural Network for Text-to-SQL Parsing [J]. Computer Science, 2022, 49(4): 110-115.
[10] MIAO Xu-peng, ZHOU Yue, SHAO Ying-xia, CUI Bin. GSO:A GNN-based Deep Learning Computation Graph Substitutions Optimization Framework [J]. Computer Science, 2022, 49(3): 86-91.
[11] CHEN Shi-cong, YUAN De-yu, HUANG Shu-hua, YANG Ming. Node Label Classification Algorithm Based on Structural Depth Network Embedding Model [J]. Computer Science, 2022, 49(3): 105-112.
[12] GUO Lei, MA Ting-huai. Friend Closeness Based User Matching [J]. Computer Science, 2022, 49(3): 113-120.
[13] YANG Xu-hua, WANG Lei, YE Lei, ZHANG Duan, ZHOU Yan-bo, LONG Hai-xia. Complex Network Community Detection Algorithm Based on Node Similarity and Network Embedding [J]. Computer Science, 2022, 49(3): 121-128.
[14] ZHENG Su-su, GUAN Dong-hai, YUAN Wei-wei. Heterogeneous Information Network Embedding with Incomplete Multi-view Fusion [J]. Computer Science, 2021, 48(9): 68-76.
[15] ZHANG Ren-zhi, ZHU Yan. Malicious User Detection Method for Social Network Based on Active Learning [J]. Computer Science, 2021, 48(6): 332-337.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!