Computer Science (计算机科学), 2022, Vol. 49, Issue 12: 99-108. doi: 10.11896/jsjkx.220400289
JIANG Jing (蒋竞), PING Yuan (平源), WU Qiu-di (吴秋迪), ZHANG Li (张莉)
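Abstract: Gitcoin is a crowdsourcing platform built on the open-source community GitHub. On Gitcoin, project teams publish development tasks, developers pick tasks that interest them and register for them, and publishers select suitable developers to complete the tasks and pay out bounties. However, some tasks fail because no developer registers, some are not completed to an acceptable standard, and even successfully completed tasks often wait a long time before developers register. A developer recommendation method is therefore needed to quickly find suitable developers for crowdsourcing tasks, shorten the time until developers register, identify potentially suitable developers and encourage them to register, and so help tasks complete successfully. This paper proposes DEVRec (Developer Recommendation), a developer recommendation method based on the LGBM classification algorithm. The method extracts task features, developer features, and features of the relationship between developers and tasks, uses the LGBM classification algorithm to perform binary classification, computes the probability that a developer registers for a task, and finally produces a list of recommended developers for each crowdsourcing task. To evaluate the recommendations, 1,599 completed crowdsourcing tasks, 343 task publishers, and 1,605 developers were collected from Gitcoin. Experimental results show that, compared with the baseline Policy Model, DEVRec improves top-1, top-3, top-5, and top-10 recommendation accuracy and MRR by 73.11%, 119.07%, 86.55%, 29.24%, and 62.27%, respectively.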
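The ranking and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the classifier is left out, and the per-developer registration probabilities are hypothetical inputs. The abstract does not define its metrics, so the sketch assumes that top-k accuracy is the fraction of tasks with at least one actual registrant among the first k recommendations, and that MRR averages the reciprocal rank of the first actual registrant. All function names and developer ids are illustrative.

```python
from typing import Dict, List, Set


def rank_developers(scores: Dict[str, float]) -> List[str]:
    """Order developer ids by predicted registration probability, highest first."""
    # Ties are broken by developer id so the ranking is deterministic.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [developer for developer, _ in ordered]


def top_k_accuracy(rankings: List[List[str]], truths: List[Set[str]], k: int) -> float:
    """Fraction of tasks with an actual registrant in the top k (assumed definition)."""
    hits = sum(1 for ranked, truth in zip(rankings, truths) if truth & set(ranked[:k]))
    return hits / len(rankings)


def mean_reciprocal_rank(rankings: List[List[str]], truths: List[Set[str]]) -> float:
    """Average of 1/rank of the first actual registrant (assumed definition)."""
    total = 0.0
    for ranked, truth in zip(rankings, truths):
        for position, developer in enumerate(ranked, start=1):
            if developer in truth:
                total += 1.0 / position
                break  # only the first hit counts; tasks with no hit add 0
    return total / len(rankings)


# Hypothetical classifier output: one probability per developer, per task.
predicted = [
    {"dev_a": 0.9, "dev_b": 0.4, "dev_c": 0.1},
    {"dev_a": 0.2, "dev_b": 0.3, "dev_c": 0.7},
]
# Hypothetical ground truth: developers who actually registered for each task.
registered = [{"dev_a"}, {"dev_b"}]

rankings = [rank_developers(task_scores) for task_scores in predicted]
top1 = top_k_accuracy(rankings, registered, k=1)
mrr = mean_reciprocal_rank(rankings, registered)
```

In a full pipeline the probabilities would come from a trained binary classifier rather than hand-written dictionaries; the ranking and metric code would stay the same.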