Computer Science ›› 2022, Vol. 49 ›› Issue (12): 195-204. doi: 10.11896/jsjkx.210600029

• Database & Big Data & Data Science •

Integrating XGBoost and SHAP Model for Football Player Value Prediction and Characteristic Analysis

LIAO Bin1, WANG Zhi-ning2, LI Min2, SUN Rui-na2,3,4   

    1 College of Big Data Statistics, Guizhou University of Finance and Economics, Guiyang 550025, China
    2 College of Statistics and Data Science, Xinjiang University of Finance and Economics, Urumqi 830012, China
    3 Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China
    4 School of Networks Security, University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2021-06-03 Revised: 2021-10-22 Published: 2022-12-14
  • About author: LIAO Bin, born in 1986, Ph.D., associate professor, is a member of the China Computer Federation. His main research interests include deep learning, data mining, and big data computing models. WANG Zhi-ning, born in 1994, postgraduate. His main research interests include machine learning and big data.
  • Supported by:
    National Natural Science Foundation of China (61562078), Xinjiang “Tianshan Cedar Plan” Young Top Talent Reserve Project: Research on Machine Learning Frontier Algorithm and Its Application, and Scientific Research Program of Colleges and Universities in Xinjiang (XJEDU2021Y037).

Abstract: As football becomes increasingly globalized, the global player transfer market continues to grow. However, although a player’s transfer value is the most important factor in a transfer transaction, it has received little in-depth modeling or applied research. This paper takes FIFA’s official player database as its research object. First, players are separated by position, and Box-Cox transformation, F-Score feature selection, and related methods are applied to process the features of the original data set. Second, a player value prediction model is built with XGBoost and compared with mainstream machine learning algorithms, including random forest, AdaBoost, GBDT, and SVR, in 10-fold cross-validation experiments. The experimental results show that the XGBoost model outperforms these models on the R², MAE, and RMSE metrics. Finally, building on the value prediction model, the paper integrates the SHAP framework to analyze the key factors affecting players’ value scores in different positions, providing decision support for scenarios such as player value score evaluation, comparative analysis, and training strategy formulation.
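
The SHAP framework mentioned in the abstract attributes a prediction to individual features using Shapley values. As a rough illustration of that idea only, the following sketch computes exact Shapley values for a toy scoring function by averaging marginal contributions over all feature orderings. Everything in it is an assumption made for illustration: the function `toy_value_model`, the feature names (`finishing`, `pace`, `age`), and the numbers are hypothetical and are not the paper's model or data. In practice the SHAP library computes these attributions efficiently for tree ensembles such as XGBoost rather than by brute-force enumeration.

```python
from itertools import permutations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    Features are switched from their baseline value to their actual value
    one at a time, in every possible order; each feature's attribution is
    its average marginal change in the prediction.
    """
    names = list(x)
    totals = {name: 0.0 for name in names}
    for order in permutations(names):
        current = dict(baseline)
        previous = predict(current)
        for name in order:
            current[name] = x[name]
            updated = predict(current)
            totals[name] += updated - previous
            previous = updated
    n_orders = factorial(len(names))
    return {name: total / n_orders for name, total in totals.items()}


def toy_value_model(features):
    # Hypothetical scoring rule with one interaction term; "age" is
    # deliberately ignored so that its attribution comes out as zero.
    return (0.5 * features["finishing"]
            + 0.3 * features["pace"]
            + 0.002 * features["finishing"] * features["pace"])


# Hypothetical player and reference ("average") player.
player = {"finishing": 90.0, "pace": 80.0, "age": 24.0}
baseline = {"finishing": 60.0, "pace": 70.0, "age": 27.0}

phi = shapley_values(toy_value_model, player, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.2f}")

# Additivity: the attributions sum to the gap between the two predictions.
gap = toy_value_model(player) - toy_value_model(baseline)
print(f"sum of attributions = {sum(phi.values()):.2f}, prediction gap = {gap:.2f}")
```

The additivity check at the end reflects the property that makes such attributions useful for comparing players: the per-feature contributions account for the entire difference between a player's predicted score and the reference score.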

Key words: Machine learning, Player’s value prediction, Training strategy, XGBoost algorithm, SHAP value

CLC Number: 

  • TP391