Computer Science ›› 2016, Vol. 43 ›› Issue (9): 47-51. doi: 10.11896/j.issn.1002-137X.2016.09.008
• The 3rd CCF Academic Conference on Big Data, 2015 •
薛伟,张文生,任俊宏
XUE Wei, ZHANG Wen-sheng and REN Jun-hong
Abstract: We consider a class of learning problems in which the objective function can be expressed as the average of a large number of component functions, each of which is assumed to be smooth. Among machine learning methods, online learning is attractive for big data because of its simple workflow, fast convergence, and support for automatic model updating. For this class of problems, we propose an online learning method based on stochastic spectral gradient descent (S2GD). The method uses the Rayleigh quotient to collect second-order information about the objective function and thereby construct an approximation to the inverse Hessian. S2GD can be viewed as an extension of the spectral gradient method from deterministic optimization to stochastic optimization. The search direction generated at each iteration is a descent direction, and existing results show that the algorithm converges. Preliminary experiments on LIBSVM datasets show that S2GD is feasible and effective.
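To make the abstract's description concrete, the following is a minimal illustrative sketch in Python (NumPy), not the authors' exact S2GD algorithm. It assumes a finite-sum objective f(w) = (1/n) Σ_i f_i(w), draws a mini-batch stochastic gradient at each iteration, and uses a Barzilai-Borwein-type step size sᵀs / sᵀy, a Rayleigh-quotient-based scalar approximation of the inverse Hessian, as the spectral scaling. The function name s2gd_sketch, the mini-batch size, and the safeguard threshold are illustrative assumptions.

```python
import numpy as np

def s2gd_sketch(grad_batch, w0, n, epochs=10, batch=10, eta0=0.1, seed=0):
    """Illustrative stochastic spectral-gradient descent sketch (not the paper's exact S2GD).

    grad_batch(w, idx) -- mini-batch gradient of the component functions indexed by idx
    w0                 -- initial parameter vector
    n                  -- number of component functions
    """
    rng = np.random.default_rng(seed)
    w = np.asarray(w0, dtype=float).copy()
    w_prev = None
    eta = eta0
    for _ in range(epochs * max(n // batch, 1)):
        idx = rng.choice(n, size=batch, replace=False)
        g = grad_batch(w, idx)                    # mini-batch stochastic gradient at w
        if w_prev is not None:
            s = w - w_prev                        # parameter difference
            y = g - grad_batch(w_prev, idx)       # gradient difference on the same mini-batch
            sy = float(s @ y)
            if sy > 1e-12:                        # safeguard: keep the step size positive
                # Barzilai-Borwein / Rayleigh-quotient scalar approximation of the inverse Hessian
                eta = float(s @ s) / sy
        w_prev = w
        w = w - eta * g                           # spectral (scaled) gradient step
    return w

if __name__ == "__main__":
    # Toy usage: least-squares components f_i(w) = 0.5 * (x_i^T w - y_i)^2
    rng = np.random.default_rng(1)
    X, y = rng.standard_normal((500, 20)), rng.standard_normal(500)
    grad = lambda w, idx: X[idx].T @ (X[idx] @ w - y[idx]) / len(idx)
    w_hat = s2gd_sketch(grad, np.zeros(20), n=500)
```

The safeguard sᵀy > 0 keeps the scaled negative gradient a descent direction, consistent with the descent property stated in the abstract, and evaluating both gradients on the same mini-batch keeps the curvature estimate consistent, as is standard in stochastic quasi-Newton-type methods.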