Computer Science, 2021, Vol. 48, Issue (6A): 165-168. doi: 10.11896/jsjkx.200900168

• Big Data & Data Science •

Research on Factors Affecting Stock Inflection Point Based on Machine Learning Algorithms

YUAN Yu-kun1, LI Gang1, ZHAO Zhi-xiang1, XU Li2

  1 China Securities Data Co., Ltd., Beijing 100032, China
  2 Key Laboratory of Network Data Science & Technology, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
  • Online: 2021-06-10 Published: 2021-06-17
  • Corresponding author: LI Gang (ligang@cmsmc.cn)
  • About author: YUAN Yu-kun (yuanyk@cmsmc.cn), born in 1994, postgraduate. His main research interests include machine learning and natural language processing.
    LI Gang, born in 1980, postgraduate. His main research interests include computer science and quantitative finance.
  • Supported by:
    National Natural Science Foundation of China (91746301, 61902380) and Beijing Nova Program (Z201100006820061).

Abstract: Trading activity in the stock market fully reflects the behavioral characteristics of investors and affects the trend of the entire market. As the lowest-level trading data of the stock market, transaction detail data give a comprehensive picture of stock trading, which makes them a vital reference for judging market trends and an effective aid to capital market regulators making decisions in the field of risk monitoring. This paper proposes a method that quickly extracts investor trading features from massive transaction detail data, then uses machine learning algorithms such as logistic regression, decision tree, and random forest to find the main factors behind large inflection points in the overall market and to predict the time range over which such inflection points occur. Experiments on the Shanghai and Shenzhen stock indexes show that, compared with a traditional model, the proposed method improves the accuracy of predicting large inflection points by approximately 10%, and that the accuracy stays at around 70% in a six-month backtesting experiment, which demonstrates the validity of the model.

Key words: Machine learning, Risk monitoring, Stock inflection point, Stock market, Trend judgement

CLC Number:

  • TP391
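
The abstract describes fitting classifiers such as logistic regression to trading features and reading off which features matter most for large inflection points. The following is only a minimal sketch of that general idea, not the authors' implementation: the feature names (`net_buy_ratio`, `large_order_share`, `noise_feature`), the synthetic data generator, and the chronological 800/200 split are all assumptions made for illustration. Because the synthetic features share unit variance, the absolute value of each fitted coefficient is used here as a rough importance score.

```python
import math
import random

# Hypothetical feature names, chosen only for illustration.
FEATURE_NAMES = ["net_buy_ratio", "large_order_share", "noise_feature"]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic_days(n_days, seed=0):
    """Generate synthetic daily features and inflection labels (not real market data)."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n_days):
        x = [rng.gauss(0.0, 1.0) for _ in FEATURE_NAMES]
        # Assumed ground truth: only the first two features drive the label.
        p = sigmoid(2.0 * x[0] - 0.8 * x[1])
        rows.append(x)
        labels.append(1 if rng.random() < p else 0)
    return rows, labels


def predict_proba(weights, bias, x):
    """Predicted probability that a day is labeled as an inflection point."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def fit_logistic(rows, labels, lr=0.5, epochs=300):
    """Fit logistic regression with full-batch gradient descent."""
    n, d = len(rows), len(rows[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(rows, labels):
            err = predict_proba(weights, bias, x) - y
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def accuracy(weights, bias, rows, labels):
    """Share of days whose thresholded prediction matches the label."""
    hits = sum((predict_proba(weights, bias, x) >= 0.5) == (y == 1)
               for x, y in zip(rows, labels))
    return hits / len(rows)


rows, labels = make_synthetic_days(1000)
# Chronological split: fit on the first 800 days, evaluate on the last 200.
train_x, test_x = rows[:800], rows[800:]
train_y, test_y = labels[:800], labels[800:]

weights, bias = fit_logistic(train_x, train_y)
# Rank features by absolute coefficient (features have unit variance here).
ranked = sorted(zip(FEATURE_NAMES, weights), key=lambda t: abs(t[1]), reverse=True)
test_accuracy = accuracy(weights, bias, test_x, test_y)

for name, w in ranked:
    print(f"{name}: {w:+.3f}")
print(f"held-out accuracy: {test_accuracy:.3f}")
```

Evaluating on the later days only loosely mirrors a backtest. A decision tree or random forest could be ranked in a similar way through its feature importance scores, although that is not shown here.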