Computer Science, 2025, Vol. 52, Issue (8): 62-70. doi: 10.11896/jsjkx.250300005
ZHAO Shengyu1, PENG Jiaheng2, WANG Wei2, HUANG Fan2
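Abstract: With the rapid growth of open-source ecosystems, influence evaluation has become an important tool for measuring developer contributions and project value. The heterogeneous networks found in open-source ecosystems are complex, so traditional static evaluation methods struggle to fully capture how influence propagates between nodes. To address this problem, this paper proposes an OpenRank dynamics method that combines static evaluation with dynamic propagation models to assess the influence of nodes in open-source communities from multiple dimensions and from a dynamic perspective. First, using a matrix-algebra method and a graph-iteration method based on the Pregel framework, the OpenRank algorithm is computed efficiently on small-to-medium and large-scale networks, so that it remains applicable and efficient across network sizes. Second, the classic independent cascade (IC) model, linear threshold (LT) model, and susceptible-infected-recovered (SIR) epidemic model are used to analyze the patterns, speed, and range of influence propagation from the perspective of propagation mechanisms, compensating for the inability of traditional static evaluation methods to describe the propagation process. Experimental results show that the OpenRank dynamics method significantly outperforms traditional methods in the efficiency and range of influence propagation, and that it offers good engineering adaptability and scalability.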
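For readers unfamiliar with the propagation models named in the abstract, the following is a minimal sketch of the classic independent cascade (IC) model, written under stated assumptions. It is not the authors' implementation: the toy graph, the node names, the uniform edge probability, and the function names `independent_cascade` and `expected_spread` are all hypothetical and chosen only for illustration.

```python
import random
from collections import deque


def independent_cascade(graph, seeds, prob, rng):
    """Simulate one run of the independent cascade (IC) model.

    graph: dict mapping each node to a list of out-neighbours (toy input).
    seeds: iterable of initially active nodes.
    prob:  activation probability, applied uniformly to every edge
           (an assumption; real studies usually use per-edge values).
    rng:   random.Random instance, passed in so runs are reproducible.
    Returns the set of nodes that are active when the cascade stops.
    """
    ordered_seeds = list(dict.fromkeys(seeds))  # de-duplicate, keep order
    active = set(ordered_seeds)
    frontier = deque(ordered_seeds)
    while frontier:
        node = frontier.popleft()
        for neighbour in graph.get(node, []):
            # A newly active node gets exactly one chance to activate
            # each neighbour that is still inactive.
            if neighbour not in active and rng.random() < prob:
                active.add(neighbour)
                frontier.append(neighbour)
    return active


def expected_spread(graph, seeds, prob, runs=1000, seed=0):
    """Monte Carlo estimate of the mean number of activated nodes."""
    rng = random.Random(seed)
    total = sum(
        len(independent_cascade(graph, seeds, prob, rng)) for _ in range(runs)
    )
    return total / runs


# Hypothetical toy network mixing developers and repositories.
toy_graph = {
    "dev_a": ["repo_x", "repo_y"],
    "repo_x": ["dev_b"],
    "repo_y": ["dev_b", "dev_c"],
    "dev_b": [],
    "dev_c": [],
}

if __name__ == "__main__":
    print(expected_spread(toy_graph, ["dev_a"], prob=0.5))
```

In a study of the kind the abstract describes, the seed set would presumably be chosen by an influence ranking, and the estimated spread would be compared across ranking methods; that pairing is an assumption here, since the abstract does not give the experimental protocol.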