Computer Science ›› 2017, Vol. 44 ›› Issue (10): 193-202. doi: 10.11896/j.issn.1002-137X.2017.10.036

• Artificial Intelligence •

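Stock Price Movements Prediction Based on Multisources

RAO Dong-ning, DENG Fu-dong and JIANG Zhi-hua

  1. School of Computers, Guangdong University of Technology, Guangzhou 510006; School of Computers, Guangdong University of Technology, Guangzhou 510006; Department of Computer Science, College of Information Science and Technology, Jinan University, Guangzhou 510632
  • Online: 2018-12-01 Published: 2018-12-01
  • Supported by:
    This work was supported by the Natural Science Foundation of Guangdong Province (2016A030313084, 6A030313700, 4A030313374) and the Fundamental Research Funds for the Central Universities.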

Abstract: Predicting stock prices and their movements is a hot topic in the financial intelligence field. Researchers have tried many data sources for stock price prediction, such as fundamental economic features, technical indicators, Internet public opinion, financial announcements, financial news, and financial research reports. However, most previous studies use only one or two distinct data sources to build prediction models; very few use three or more sources simultaneously. More sources provide richer information content and more levels of information. But because the sources differ in nature and affect the stock market to different degrees, it is not easy to fuse several sources for stock price prediction. In addition, multiple sources increase the risk of the curse of dimensionality. Based on the idea of information fusion, this paper attempts to use three distinct sources simultaneously to predict stock price movement: fundamental economic features, technical indicators, and Internet public opinion. Our method first applies source-specific preprocessing to each type of data to form a unified data set, and then uses an SVM classifier to build prediction models. Experimental results show that, when the linear kernel function is chosen for the SVM classifier and data from non-trading days are considered, the prediction model based on the three sources performs better than models that use a single source or any pair of sources. In addition, while collecting data, we found that the volume of Internet public opinion rose sharply on non-trading days (for example, weekends or suspension periods), although there were no transactions. We therefore added sentiment data for public opinion on non-trading days to the experimental data, and the classification accuracy improved. The study shows that although it is difficult to fuse multiple sources for stock prediction, a good predictor can be obtained after appropriate feature selection and source-specific data preprocessing.

Key words: Multisources, Stock price movement prediction, SVM classification
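
The fusion step described in the abstract (aligning three sources into one data set and taking non-trading-day sentiment into account) can be illustrated with a minimal sketch. The abstract does not say how the paper aligns the sources, so the roll-forward rule used here, the function name `build_dataset`, and all sample values are assumptions made purely for illustration.

```python
from datetime import date


def build_dataset(fundamentals, technicals, sentiment, trading_days):
    """Fuse three sources into one feature row per trading day.

    ASSUMPTION (not from the paper): sentiment posted on non-trading days
    is accumulated and added to the sentiment of the next trading day.
    """
    trading = set(trading_days)
    rows = {}
    carried = 0.0  # sentiment accumulated over non-trading days
    for day in sorted(set(sentiment) | trading):
        score = sentiment.get(day, 0.0)
        if day not in trading:
            carried += score  # weekend or suspension: hold for later
            continue
        # One unified row: fundamental + technical + sentiment features.
        rows[day] = fundamentals[day] + technicals[day] + [score + carried]
        carried = 0.0
    return rows


# Hypothetical sample data: Friday and Monday are trading days,
# Saturday and Sunday carry sentiment only.
fri, sat, sun, mon = (date(2017, 1, d) for d in (6, 7, 8, 9))
fundamentals = {fri: [12.5], mon: [12.5]}   # e.g. one fundamental feature
technicals = {fri: [0.25], mon: [-0.5]}     # e.g. one technical indicator
sentiment = {fri: 0.5, sat: 1.0, sun: -0.25, mon: 0.25}

rows = build_dataset(fundamentals, technicals, sentiment, [fri, mon])
```

Rows built this way, paired with up/down labels, could then be passed to a linear-kernel SVM, for example scikit-learn's `SVC(kernel="linear")`. That choice of library is likewise an assumption, not something the abstract specifies.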

