Computer Science, 2021, Vol. 48, Issue (12): 59-66. doi: 10.11896/jsjkx.210100077

• Computer Software •

Approach of God Class Detection Based on Evolutionary and Semantic Features

WANG Ji-wen, WU Yi-jian, PENG Xin   

  1. Software School, Fudan University, Shanghai 200438, China
    Shanghai Key Laboratory of Data Science, Shanghai 200438, China
  • Received: 2021-01-10 Revised: 2021-03-21 Online: 2021-12-15 Published: 2021-11-26
  • About author: WANG Ji-wen, born in 1997, postgraduate. His main research interests include software design analysis and software evolution analysis.
    WU Yi-jian, born in 1979, Ph.D, associate professor, is a member of China Computer Federation. His main research interests include big code analysis, software evolution analysis, and code clone detection and management.
  • Supported by:
    National Key R & D Program of China (2017YFB1002000) and Shanghai Science and Technology Development Funds (18DZ1112100, 18DZ1112102).

Abstract: As software development iterations accelerate, developers often violate basic principles of software design for reasons such as delivery pressure, which introduces code smells and degrades software quality. The god class is one of the most common code smells; it refers to a class that has taken on too many responsibilities. A god class violates the design principle of "high cohesion and low coupling", damages the quality of the software system, and reduces the understandability and maintainability of the code. This paper therefore proposes a new method of god class detection. The method extracts evolutionary and semantic features from an actual project and merges them. Based on the merged features, it re-clusters all the methods in the project. By analyzing how the member methods of each class are distributed across the new clusters, it calculates the cohesion of each class and reports the classes with low cohesion as god classes. Experiments show that this method outperforms the current mainstream god class detection methods. Compared with traditional measurement-based detection methods, the recall and precision rates of the proposed method are increased by more than 20%. Compared with detection methods based on machine learning, the recall rate of the proposed method is slightly lower, but its precision rate and F1 value are significantly improved.
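The final step described in the abstract, scoring each class by how its member methods spread across the re-clustered groups, can be illustrated with a minimal sketch. The abstract does not give the cohesion formula, so everything here is an assumption made for illustration: cohesion is taken to be the share of a class's methods that land in its most common cluster, and the names `class_cohesion`, `detect_god_classes`, `threshold`, and `min_methods` are hypothetical. The cluster labels are supplied by hand; in the described approach they would come from clustering the merged evolutionary and semantic features.

```python
from collections import Counter


def class_cohesion(method_clusters):
    """Assumed cohesion measure (not from the paper): the fraction of a
    class's methods that fall into its single most common cluster."""
    counts = Counter(method_clusters)
    return max(counts.values()) / len(method_clusters)


def detect_god_classes(class_to_clusters, threshold=0.5, min_methods=3):
    """Return the names of classes whose assumed cohesion is below threshold.

    class_to_clusters maps a class name to the cluster label of each of its
    member methods. threshold and min_methods are illustrative values only.
    """
    flagged = []
    for name, clusters in class_to_clusters.items():
        # Skip very small classes: too few methods to judge their spread.
        if len(clusters) < min_methods:
            continue
        if class_cohesion(clusters) < threshold:
            flagged.append(name)
    return sorted(flagged)


# Hand-made cluster labels for two hypothetical classes.
example = {
    "OrderManager": [0, 1, 2, 3, 0, 4],  # methods scattered over five clusters
    "PriceFormatter": [5, 5, 5, 5],      # all methods in one cluster
}

if __name__ == "__main__":
    print(detect_god_classes(example))
```

With these hand-made labels, only the class whose methods are scattered across many clusters falls below the assumed threshold. Other measures of spread, such as the entropy of the cluster distribution, would fit the same description equally well.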

Key words: Code smell, Cohesion, God class, Software evolution

CLC Number: TP311.5
[1] ZHANG Jiu-jie, CHEN Chao, NIE Hong-xuan, XIA Yu-qin, ZHANG Li-ping, MA Zhan-fei. Empirical Study on Stability of Clone Code Sets Based on Class Granularity [J]. Computer Science, 2021, 48(5): 75-85.
[2] HE Peng, YU Lv-jun. Analysis of Open Source Software Cliff Walls for Group Collaborative Development [J]. Computer Science, 2020, 47(6): 51-58.
[3] ZHANG Jing-xuan, JIANG He. Research Status and Development Trend of Identifier Normalization [J]. Computer Science, 2020, 47(3): 1-4.
[4] MENG Fan-yi, WANG Ying, YU Hai, ZHU Zhi-liang. Refactoring of Complex Software Systems Research: Present Problem and Prospect [J]. Computer Science, 2020, 47(12): 1-10.
[5] ZHONG Lin-hui, FU Li-juan, YE Hai-tao, QI Jie, XU Jing. Study on Reverse Engineering Generation Method of Software Evolution History [J]. Computer Science, 2020, 47(11A): 549-556.
[6] PAN Hao, ZHENG Wei, ZHANG Zi-feng, LU Chao-qun. Study on Fractal Features of Software Networks [J]. Computer Science, 2019, 46(2): 166-170.
[7] TANG Qian-wen, CHEN Liang-yu. Analysis of Java Open Source System Evolution Based on Complex Network Theory [J]. Computer Science, 2018, 45(8): 166-173.
[8] YU Yong, KANG Qing-yi, CHEN Chang-geng, KAN Shi-lin, LUO Yong-jun. Bisecting K-means Clustering Method Based on Cohesion and Coupling [J]. Computer Science, 2018, 45(6A): 460-464.
[9] LIU Li-qian, DONG Dong. Long Method Detection Based on Cost-sensitive Integrated Classifier [J]. Computer Science, 2018, 45(11A): 497-500.
[10] ZHENG Jiao-jiao, LI Tong, LIN Ying, XIE Zhong-wen, WANG Xiao-fang, CHENG Lei, LIU Miao. Judgement Method of Evolution Consistency of Component System [J]. Computer Science, 2018, 45(10): 189-195.
[11] MA Sai and DONG Dong. Detection of Large Class Based on Latent Semantic Analysis [J]. Computer Science, 2017, 44(Z6): 495-498.
[12] DU Bo, YU Yan and DAI Gang. Study on Multi-collaborative Filtering Algorithm of Command Information Based on Cloud Models [J]. Computer Science, 2017, 44(Z11): 470-475.
[13] HE Yun, WANG Wei and LI Tong. Formal Method for Describing Software Evolution Ability Feature [J]. Computer Science, 2017, 44(7): 128-136.
[14] ZHAO Hui-qun and HUANG Yu-han. Program Verification of Software Model’s Algebraic Properties [J]. Computer Science, 2017, 44(11): 240-245.
[15] ZHONG Lin-hui, LI Jun-jie, XIA Jin and XUE Liang-bo. Research on Evolution Similarity Measurement of Component-based Software Based on Multi-dimensional Evolution Properties [J]. Computer Science, 2016, 43(Z11): 499-505.