Computer Science ›› 2015, Vol. 42 ›› Issue (9): 195-198. doi: 10.11896/j.issn.1002-137X.2015.09.037

• Artificial Intelligence •

  1. School of Computer Science and Technology, Tianjin University, Tianjin 300072, China
  • Funding: Supported by the National Natural Science Foundation of China (61170019)

Efficient Algorithm for Large-scale Support Vector Machine

FENG Chang, LI Zi-da and LIAO Shi-zhong   

  • Online: 2018-11-14  Published: 2018-11-14


Abstract: Existing algorithms for training large-scale support vector machines (SVMs) require large amounts of memory and computation time, so large-scale SVMs are usually trained on computer clusters or supercomputers. An efficient algorithm for large-scale SVM that can be run on an ordinary PC was presented. First, the large-scale training examples were subsampled to reduce the data size. Then, the random Fourier mapping was applied explicitly to the subsample to construct a random feature space, in which a linear SVM uniformly approximates the Gaussian kernel SVM. Finally, a parallelized linear SVM was implemented to further speed up training. Experimental results on benchmark datasets demonstrate the feasibility and efficiency of the proposed algorithm.

Key words: Large-scale support vector machine, Subsampling, Random Fourier features, Parallelized linear SVM
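The key idea in the second step is the random Fourier feature construction of Rahimi and Recht, under which inner products in an explicit, finite-dimensional random feature space approximate the Gaussian kernel, so an ordinary linear SVM can stand in for the kernel SVM. The following NumPy sketch illustrates the general technique only, not the authors' implementation; the feature dimension `D`, the bandwidth `gamma`, and the synthetic data are arbitrary choices for demonstration:

```python
import numpy as np

def random_fourier_features(X, D, gamma, seed=0):
    """Map X (n x d) into a D-dimensional random feature space whose
    inner products approximate the Gaussian kernel
    k(x, y) = exp(-gamma * ||x - y||^2)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # The spectral density of this kernel is N(0, 2*gamma*I),
    # so frequencies are drawn with standard deviation sqrt(2*gamma).
    W = rng.normal(0.0, np.sqrt(2.0 * gamma), size=(d, D))
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

# Sanity check: the *linear* kernel in the random feature space
# approximates the *Gaussian* kernel on the original inputs.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 5))
gamma = 0.5
Z = random_fourier_features(X, D=20000, gamma=gamma, seed=1)
K_approx = Z @ Z.T
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K_exact = np.exp(-gamma * sq_dists)
err = np.abs(K_approx - K_exact).max()
print("max abs kernel error:", err)  # shrinks roughly like O(1/sqrt(D))
```

After this mapping, training reduces to a linear SVM on the n x D matrix `Z`, which avoids forming the n x n kernel matrix and is what makes the subsequent multi-core linear-SVM step (e.g., a LIBLINEAR-style solver) applicable.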

