Computer Science, 2021, Vol. 48, Issue (5): 75-85. doi: 10.11896/jsjkx.200900062

• Computer Software •

Empirical Study on Stability of Clone Code Sets Based on Class Granularity

ZHANG Jiu-jie¹, CHEN Chao¹, NIE Hong-xuan¹, XIA Yu-qin¹, ZHANG Li-ping², MA Zhan-fei¹

  1 Department of Computer Science & Technology, Baotou Teachers' College, Baotou, Inner Mongolia 014030, China
  2 School of Computer Science & Technology, Inner Mongolia Normal University, Hohhot 010022, China
  • Received: 2020-09-08 Revised: 2020-12-02 Online: 2021-05-15 Published: 2021-05-09
  • About author: ZHANG Jiu-jie, born in 1990, holds a master's degree and is a member of the China Computer Federation. His main research interests include software engineering, software maintenance and evolution, program source code analysis, and clone code detection and management. (zhangjiujie@bttc.edu.cn)
    XIA Yu-qin, born in 1973, holds a master's degree and is an associate professor. Her main research interests include software engineering, artificial intelligence and computer education.
  • Supported by:
    National Natural Science Foundation of China (61762071, 61462071) and Natural Science Foundation of the Inner Mongolia Autonomous Region, China (2014MS0613, 2015MS0606, 2016MS0614, 2019MS06037).

Abstract: Research on clone code is closely related to many problems in software engineering. Existing studies on the stability of clone code mainly compare clone code with non-clone code, or compare different types of clone code. Few studies consider the object-oriented classes across which clone sets are distributed. This paper presents a comprehensive empirical study on the stability of clone sets at object-oriented class granularity. It poses four research questions about the stability of clone sets. To address them, all clone sets are categorized into three groups: intra-class clone sets, inter-class clone sets and hybrid-class clone sets. Their stability over the course of software evolution is then compared and analyzed using 9 evolution patterns from 4 perspectives. First, clone code fragments in all revisions of the subject systems are detected and tagged with the object-oriented classes in which they are located. Next, clone sets in adjacent revisions are mapped to each other based on the mapping of their clone fragments, and the evolution patterns of the clone sets are recognized and tagged. After that, clone genealogies are constructed by combining the mapping relations and the evolution patterns, and the stability of the three groups of clone sets is calculated from the different perspectives. Finally, the differences among the three groups are compared and analyzed. According to the experimental results on 7,700 revisions of seven diverse object-oriented subject systems, about 60% of intra-class clone sets have a life cycle longer than half of the total number of revisions, while the proportions of inter-class clone sets and hybrid-class clone sets with a life cycle rate of 50% or more are both close to 35%. Among the three kinds of clone sets, intra-class clone sets change least frequently. Inter-class clone sets show slightly more merging, branching and late propagation evolution patterns. Fragment deletions, consistent changes and inconsistent changes are most frequent in hybrid-class clone sets. Overall, intra-class clone sets are the most stable, and hybrid-class clone sets should be given a higher priority for tracing or refactoring during software evolution. The clone code stability analysis methods and findings of this work provide a reference and support for clone code maintenance, tracking, refactoring and other software activities related to clone management.
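The grouping and the life cycle rate mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not give formal definitions, so everything here is an assumption made for illustration: the function names `classify_clone_set` and `life_cycle_rate` are hypothetical, a clone set is assumed to be intra-class when all fragments share one class, inter-class when every fragment is in a different class, and hybrid-class otherwise, and the life cycle rate is assumed to be the number of revisions in which a clone set exists divided by the total number of revisions.

```python
from collections import Counter
from typing import Sequence


def classify_clone_set(fragment_classes: Sequence[str]) -> str:
    """Label a clone set from the class names of its fragments.

    Assumed rule (not taken from the paper): one shared class is
    intra-class, all-distinct classes is inter-class, any mix is hybrid.
    """
    if len(fragment_classes) < 2:
        raise ValueError("a clone set needs at least two fragments")
    counts = Counter(fragment_classes)
    if len(counts) == 1:
        return "intra-class"   # every fragment is in the same class
    if all(n == 1 for n in counts.values()):
        return "inter-class"   # every fragment is in a different class
    return "hybrid-class"      # some fragments share a class, others do not


def life_cycle_rate(revisions_alive: int, total_revisions: int) -> float:
    """Assumed definition: share of all revisions in which the set exists."""
    if total_revisions <= 0:
        raise ValueError("total_revisions must be positive")
    if not 0 <= revisions_alive <= total_revisions:
        raise ValueError("revisions_alive must lie in [0, total_revisions]")
    return revisions_alive / total_revisions


if __name__ == "__main__":
    # Hypothetical class tags for the fragments of three clone sets.
    print(classify_clone_set(["Parser", "Parser"]))
    print(classify_clone_set(["Parser", "Lexer"]))
    print(classify_clone_set(["Parser", "Parser", "Lexer"]))
    print(life_cycle_rate(600, 1100))
```

Under these assumptions, a clone set that survives 600 of 1,100 revisions has a life cycle rate above 50% and would count toward the long-lived share reported for each group.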

Key words: Clone code, Hybrid-class clone, Inter-class clone, Intra-class clone, Software evolution, Software maintenance, Stability

CLC Number: TP311.5