Computer Science ›› 2025, Vol. 52 ›› Issue (8): 62-70. doi: 10.11896/jsjkx.250300005

• Software Engineering •

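OpenRank Dynamics: An Influence Evaluation and Dynamic Propagation Model for Open Source Ecosystems

ZHAO Shengyu1, PENG Jiaheng2, WANG Wei2, HUANG Fan2

  1 School of Electronic and Information Engineering, Tongji University, Shanghai 200092
  2 School of Data Science and Engineering, East China Normal University, Shanghai 200062
  • Received: 2025-02-25 Revised: 2025-06-13 Online: 2025-08-15 Published: 2025-08-08
  • Corresponding author: WANG Wei (wwang@dase.ecnu.edu.cn)
  • About author: (frank_zsy@tongji.edu.cn)
  • Supported by:
    National Natural Science Foundation of China (62137001); Digital Transformation Innovation Research Project of Shanghai Municipal Education Commission (40400-22201)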

OpenRank Dynamics: Influence Evaluation and Dynamic Propagation Models for Open Source Ecosystems

ZHAO Shengyu1, PENG Jiaheng2, WANG Wei2, HUANG Fan2   

  1 School of Electronic and Information Engineering, Tongji University, Shanghai 200092, China
  2 School of Data Science and Engineering, East China Normal University, Shanghai 200062, China
  • Received: 2025-02-25 Revised: 2025-06-13 Online: 2025-08-15 Published: 2025-08-08
  • About author: ZHAO Shengyu, born in 1988, Ph.D. candidate. His main research interests include mining software repositories and open source software ecosystem networks.
    WANG Wei, born in 1979, Ph.D., professor. His main research interests include open source measurement and computational education.
  • Supported by:
    National Natural Science Foundation of China (62137001) and Digital Transformation Innovation Research Project of Shanghai Municipal Education Commission (40400-22201).

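Abstract: With the rapid growth of open source ecosystems, influence evaluation has become an important tool for measuring developer contributions and project value. In open source ecosystems, complex heterogeneous network structures make it difficult for traditional static evaluation methods to fully capture how influence propagates between nodes. To address this problem, an OpenRank dynamics method is proposed that combines static evaluation with dynamic propagation models to assess node influence in open source communities from multiple dimensions and from a dynamic perspective. First, using a matrix algebra method and a graph iteration method based on the Pregel framework, the OpenRank algorithm is computed efficiently on small-to-medium and large-scale networks respectively, which keeps the algorithm applicable and efficient across network sizes. Second, by combining the classic independent cascade (IC) model, linear threshold (LT) model, and the SIR epidemic model, the patterns, speed, and reach of influence propagation are analyzed from the perspective of the propagation mechanism, which compensates for the inability of traditional static evaluation methods to describe the propagation process. Experimental results show that the OpenRank dynamics method significantly outperforms traditional methods in influence propagation efficiency and reach, and that it offers good engineering adaptability and scalability.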

Keywords: Open source ecosystem, Influence evaluation, Dynamics model, Heterogeneous information network, OpenRank
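The abstract mentions a graph iteration method built on the Pregel framework. The page does not give the OpenRank update rule, so the following is only a minimal sketch of a generic PageRank-style computation written as vertex-centric supersteps, not the authors' OpenRank formula. The function name `pagerank_supersteps`, the damping value 0.85, and the toy graph are all assumptions made for illustration.

```python
def pagerank_supersteps(out_edges, damping=0.85, tol=1e-10, max_steps=200):
    """Generic PageRank-style iteration in a vertex-centric (Pregel-like) form.

    out_edges maps each node to the list of nodes it points to; every
    target must also appear as a key. This is NOT the OpenRank formula.
    """
    nodes = list(out_edges)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_steps):
        # "Message passing": each vertex sends an equal share to its targets.
        inbox = {v: 0.0 for v in nodes}
        dangling = 0.0
        for v in nodes:
            targets = out_edges[v]
            if targets:
                share = rank[v] / len(targets)
                for t in targets:
                    inbox[t] += share
            else:
                # Vertices without out-edges spread their rank uniformly.
                dangling += rank[v]
        # "Compute": each vertex combines its inbox with the teleport term.
        new_rank = {
            v: (1.0 - damping) / n + damping * (inbox[v] + dangling / n)
            for v in nodes
        }
        delta = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical toy graph: two developers contribute to one repository.
graph = {"alice": ["repo"], "bob": ["repo"], "repo": ["alice"]}
ranks = pagerank_supersteps(graph)
```

On a small graph the same fixed point could instead be obtained by solving the corresponding linear system directly, which is the kind of trade-off between a matrix algebra method and an iterative method that the abstract describes.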

Abstract: With the rapid development of the open source ecosystem, influence evaluation has become a critical tool for assessing developer contributions and project value. In open source communities, complex heterogeneous network structures make it difficult for traditional static evaluation methods to comprehensively capture influence propagation among nodes. To address this issue, this paper proposes an OpenRank dynamics method that integrates static evaluation with dynamic propagation models to provide a multidimensional and dynamic assessment of node influence within open source communities. First, the OpenRank algorithm is implemented using both a matrix algebra method and a graph iteration method based on the Pregel framework, enabling efficient computation on small- and medium-scale networks and on large-scale networks, respectively, and ensuring the algorithm's applicability across network sizes. Second, by incorporating classic propagation models, namely the Independent Cascade (IC) model, the Linear Threshold (LT) model, and the Susceptible-Infected-Recovered (SIR) model, this study analyzes the patterns, speed, and reach of influence propagation, addressing the limitations of traditional static evaluation methods in describing the propagation process. Experimental results demonstrate that the OpenRank dynamics method significantly outperforms traditional approaches in terms of influence propagation efficiency and reach. It also exhibits strong engineering adaptability and scalability.

Key words: Open source ecosystem, Influence evaluation, Dynamic models, Heterogeneous information network, OpenRank
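Of the propagation models named in the abstract, the Independent Cascade (IC) model is the simplest to illustrate: each newly activated node gets one chance to activate each inactive out-neighbor with some probability. The sketch below is a textbook IC simulation, not the paper's experimental setup. The uniform activation probability `prob`, the function names, and the toy graph are assumptions introduced here.

```python
import random


def independent_cascade(out_edges, seeds, prob, rng):
    """Run one Independent Cascade simulation and return the activated set.

    Assumes a single uniform activation probability `prob` on every edge,
    which is an illustrative simplification.
    """
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in out_edges.get(u, []):
                # Each newly active node gets exactly one attempt per neighbor.
                if v not in active and rng.random() < prob:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active


def expected_spread(out_edges, seeds, prob, runs=1000, seed=0):
    """Monte Carlo estimate of the mean number of activated nodes."""
    rng = random.Random(seed)
    total = sum(
        len(independent_cascade(out_edges, seeds, prob, rng))
        for _ in range(runs)
    )
    return total / runs


# Hypothetical toy network: "a" can reach "d" through "b" or "c".
toy = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
full = independent_cascade(toy, ["a"], 1.0, random.Random(0))
none = independent_cascade(toy, ["a"], 0.0, random.Random(0))
spread = expected_spread(toy, ["a"], 0.5)
```

Comparing `expected_spread` for seed sets chosen by different rankings is one plausible way to relate a static influence score to propagation reach, under the assumptions stated above.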

CLC Number: 

  • TP391