Computer Science, 2022, Vol. 49, Issue 11A: 211000017-7. doi: 10.11896/jsjkx.211000017
岑健铭1,2, 封全喜1,2, 张丽丽1, 佟锐超1
CEN Jian-ming1,2, FENG Quan-xi1,2, ZHANG Li-li1, TONG Rui-chao1
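Abstract: The "high stock transfer" (高送转) phenomenon refers to a listed company issuing a large proportion of additional shares through stock transfer. To predict whether a listed company will implement a high stock transfer, this paper proposes a lightGBM model whose hyperparameters are optimized by the differential evolution algorithm (abbreviated DE-lightGBM). The model has two main parts. First, differential evolution is used to tune the minority-class weight and the regularization coefficients in the lightGBM loss function, which addresses the class imbalance in the data. Second, with F1 and AUC as evaluation metrics, differential evolution is applied again to optimize the important hyperparameters of the lightGBM model and find the parameter combination with the best predictive performance. Numerical results show that DE-lightGBM performs well, with F1 and AUC values of 0.5368 and 0.8734, respectively. The proposed DE-lightGBM model can effectively identify listed companies that will implement a high stock transfer in the following year.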
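The differential evolution search mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses the classic DE/rand/1/bin scheme, and the function names, default settings (`pop_size`, `f`, `cr`), search bounds, and the quadratic `toy_objective` are all assumptions made for illustration. In the paper's setting, the candidate vector would instead hold quantities such as the minority-class weight and regularization coefficients, and the objective would be a validation score of a lightGBM model (for example, 1 - F1).

```python
import random


def differential_evolution(objective, bounds, pop_size=20, generations=100,
                           f=0.5, cr=0.9, seed=0):
    """Minimize `objective` over box `bounds` using DE/rand/1/bin.

    All default values here are illustrative assumptions, not the paper's settings.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    # Initialize the population uniformly at random inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(ind) for ind in pop]

    for _ in range(generations):
        for i in range(pop_size):
            # Mutation: pick three distinct individuals, all different from i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            # Force at least one coordinate to come from the mutant vector.
            j_rand = rng.randrange(dim)
            trial = []
            for j in range(dim):
                if rng.random() < cr or j == j_rand:
                    value = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    lo, hi = bounds[j]
                    value = min(max(value, lo), hi)  # clip back into the box
                else:
                    value = pop[i][j]  # binomial crossover keeps the parent gene
                trial.append(value)
            # Greedy selection: keep the trial only if it is no worse.
            trial_score = objective(trial)
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score

    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def toy_objective(x):
    # Hypothetical stand-in for "1 - F1 of a model trained with parameters x".
    return (x[0] - 3.0) ** 2 + (x[1] - 0.5) ** 2


best_x, best_score = differential_evolution(
    toy_objective, bounds=[(1.0, 10.0), (0.0, 1.0)]
)
```

Replacing `toy_objective` with a function that trains and scores a classifier would turn this into a hyperparameter search; the greedy selection step means the best score in the population never gets worse from one generation to the next.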