Computer Science ›› 2020, Vol. 47 ›› Issue (6): 51-58. doi: 10.11896/jsjkx.190300140
HE Peng1,2, YU Lv-jun1
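Abstract: Because open source software projects have a low barrier to entry and a high degree of freedom, their development often suffers from slow progress, low efficiency, and poor project quality. Meanwhile, the software cliff wall, used as a basis for judging project robustness, is a code contribution behavior in which far more code than regular incremental development would produce is completed within a short time; it is a potential threat to sustainable development during software evolution. To study the development process of open source projects in depth, characterize software evolution more accurately, and thereby improve development efficiency, analyzing the causes of software cliff walls is an effective approach. The experiment takes 9 open source software projects on GitHub, each spanning at least 5 years, as research objects. Using months and quarters as periods, it builds developer collaboration networks (DCN) from developers' follow and comment information on more than 150,000 commits, and treats any single commit of more than 10,000 lines of code as a software cliff wall. It then analyzes the relationship between DCN and software cliff walls during development from three aspects (network size, network structure, and network quality) using nine metrics: number of nodes, number of edges, node update rate, modularity, average path length, average degree, node in-degree index, mean node out-degree, and diversity. The results show that: 1) a software cliff wall forms easily when the development team is small and its membership changes substantially; 2) maintaining good "small-world" properties among developers helps avoid cliff walls; 3) the quarter is the more suitable period for analyzing the relationship between DCN and software cliff walls, and diversity in the organizational origin of team members also promotes the formation of cliff walls to some extent.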
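The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record fields (`sha`, `lines`), the function names, and the exact formulas for average degree and node update rate are assumptions made for illustration, since the abstract names the metrics without defining them. Only the 10,000-line cliff threshold comes from the text.

```python
# Hypothetical sketch: field names, function names, and metric formulas
# are assumptions; only the 10,000-line threshold is from the abstract.
CLIFF_THRESHOLD = 10_000  # lines of code in a single commit


def find_cliffs(commits, threshold=CLIFF_THRESHOLD):
    """Return commits whose line count exceeds the threshold."""
    return [c for c in commits if c["lines"] > threshold]


def build_dcn(interactions):
    """Build a directed developer network for one period.

    Each interaction is a (source, target) pair, read here as
    "source followed or commented on a commit by target".
    """
    nodes, edges = set(), set()
    for source, target in interactions:
        nodes.add(source)
        nodes.add(target)
        if source != target:  # ignore self-interactions
            edges.add((source, target))
    return nodes, edges


def network_metrics(nodes, edges, previous_nodes=None):
    """Compute three of the nine metrics for one period."""
    n, m = len(nodes), len(edges)
    metrics = {
        "nodes": n,
        "edges": m,
        # Assumed definition: in-degree plus out-degree, averaged over nodes.
        "avg_degree": 2 * m / n if n else 0.0,
    }
    if previous_nodes is not None and n:
        # Assumed definition: share of current developers who were
        # absent from the previous period's network.
        metrics["node_update_rate"] = len(nodes - previous_nodes) / n
    return metrics
```

Running `build_dcn` once per month or once per quarter and passing the previous period's node set to `network_metrics` yields a time series that can be compared against the periods in which `find_cliffs` reports a commit. Modularity, average path length, and the remaining metrics would need a graph library and are left out of this sketch.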