Computer Science ›› 2019, Vol. 46 ›› Issue (8): 224-232. doi: 10.11896/j.issn.1002-137X.2019.08.037

• Software & Database Technology •

Method for Identifying and Recommending Reconstructed Clones Based on Software Evolution History

SHE Rong-rong, ZHANG Li-ping   

  1. (College of Computer and Information Engineering, Inner Mongolia Normal University, Hohhot 010022, China)
  • Received: 2018-06-26 Online: 2019-08-15 Published: 2019-08-15

Abstract: Existing research on clone code refactoring (reconstruction) is limited to static analysis of a single version and ignores how cloned code evolves, so effective methods for refactoring cloned code are lacking. To address this, the paper first extracts evolution history information closely related to cloned code from clone detection, clone mapping, clone families, and the software maintenance log management system. Second, it identifies the clones that need to be refactored and, at the same time, the clones that need to be tracked. Third, it extracts static features and evolution features relevant to refactoring and builds a feature sample database. Finally, it compares several machine learning methods and selects the best classifier to recommend clones for refactoring. Experiments were performed on nearly 170 versions of 7 software systems. The results show that the accuracy of recommending clones for refactoring exceeds 90%. The method provides more accurate and reasonable refactoring suggestions for software developers and maintainers.
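The final two steps of the abstract (build a feature sample database, then compare several learners and keep the best one) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the abstract does not name the features, the classifiers, or the evaluation protocol, so the three features, the three toy classifiers, the leave-one-out scoring, and the hand-made samples below are hypothetical and are not the authors' data or method.

```python
from collections import Counter

# Synthetic, hand-made samples (NOT data from the paper).
# Hypothetical features: (clone_length_in_lines, versions_survived, change_count)
# Label 1 = "recommend refactoring", 0 = "do not recommend".
SAMPLES = [
    ((40, 12, 5), 1), ((35, 10, 4), 1), ((50, 15, 6), 1),
    ((45, 11, 5), 1), ((38, 13, 7), 1),
    ((8, 2, 0), 0), ((10, 3, 1), 0), ((6, 1, 0), 0),
    ((12, 2, 1), 0), ((9, 4, 0), 0),
]


def majority_classifier(train):
    """Baseline: always predict the most common training label."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: label


def nearest_neighbor_classifier(train):
    """1-nearest-neighbor using squared Euclidean distance on raw features."""
    def predict(x):
        def distance(sample):
            return sum((a - b) ** 2 for a, b in zip(sample[0], x))
        return min(train, key=distance)[1]
    return predict


def stump_classifier(train):
    """Decision stump: the single feature/threshold split with most training hits."""
    best = None
    for f in range(len(train[0][0])):
        for threshold in sorted({x[f] for x, _ in train}):
            for above in (0, 1):  # label predicted when x[f] >= threshold
                hits = sum(
                    (above if x[f] >= threshold else 1 - above) == y
                    for x, y in train
                )
                if best is None or hits > best[0]:
                    best = (hits, f, threshold, above)
    _, f, threshold, above = best
    return lambda x: above if x[f] >= threshold else 1 - above


def leave_one_out_accuracy(make_classifier, samples):
    """Train on all samples but one, test on the held-out one, and average."""
    correct = 0
    for i, (x, y) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        if make_classifier(train)(x) == y:
            correct += 1
    return correct / len(samples)


def select_best(candidates, samples):
    """Score every candidate learner and return the best name plus all scores."""
    scores = {
        name: leave_one_out_accuracy(make, samples)
        for name, make in candidates.items()
    }
    return max(scores, key=scores.get), scores


CANDIDATES = {
    "majority": majority_classifier,
    "nearest_neighbor": nearest_neighbor_classifier,
    "stump": stump_classifier,
}

best_name, scores = select_best(CANDIDATES, SAMPLES)
print(best_name, scores)
```

The selection step is deliberately separate from the learners, so a different set of classifiers or a different scoring scheme (for example, k-fold cross-validation) could be swapped in without changing `select_best`.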

Key words: Code clone, Clone refactoring, Clone tracking, Clone family, Feature extraction

CLC Number: TP311.5