一种并行结构化支持向量机次梯度投影算法

计算机科学 ›› 2014, Vol. 41 ›› Issue (2): 127-130.

一种并行结构化支持向量机次梯度投影算法

郭丽娜,杨明,涂金金

南京师范大学计算机科学与技术学院南京210046;南京师范大学计算机科学与技术学院南京210046;南京师范大学计算机科学与技术学院南京210046

出版日期:2018-11-14 发布日期:2018-11-14
基金资助:
本文受国家自然科学基金(61272222,61003116),江苏省自然科学基金重点重大专项(BK2011005),江苏省自然科学基金(BK2011782),江苏省普通高校研究生科研创新计划项目(CXLX12_0415)资助

Parallel Primal Estimated Sub-GrAdient Solver for Structural SVM

GUO Li-na,YANG Ming and TU Jin-jin

Online:2018-11-14 Published:2018-11-14

摘要/Abstract

摘要： 支持向量机的次梯度投影算法是解决支持向量机优化求解问题的一种简单有效的迭代算法。该算法通过梯度下降和投影两个步骤的多轮迭代,找到两类最大间隔的分类面。针对该算法忽略了对寻找分类面同样有指导意义的样本分布信息这一问题,在分类器设计中融入结构信息,并且采用MapReduce并行计算框架,提出了一种并行结构化支持向量机的次梯度投影算法,该算法能够充分利用集群的计算和存储能力,适用于海量数据的优化问题。在NASA的两个软件模块缺陷度量数据集CM1和PC1上的实验结果表明,该算法能够加快收敛速度,提高分类性能,有效地解决海量数据的优化求解问题。

关键词: 结构化支持向量机,并行,MapReduce 中图法分类号TP181文献标识码A

Abstract: Primal estimated sub-GrAdient solver for SVM (Pegasos) is a simple and effective iterative algorithm for solving the optimization problem of Support Vector Machine．The method alternates between stochastic gradient descent steps and projection steps to find a hyper-plane that can separate two classes of samples with the maximal margin．But it neglects the data distributions which are also vital for an optimal classifier.We developed a novel algorithm,termed as Parallel Primal Estimated sub-GrAdient Solver for Structural SVM (PSPegasos) by embedding the structural information into the SVM and using the parallel computing framework:MapReduce．This algorithm can take full advantage of the computing and storage capacity of the computer cluster,and be applicable to the optimization problem of the massive data．The algorithm was used to two NASA software module datasets CM1and PC1,and the experimental results show that the algorithm can accelerate the convergence speed,improve the classification performance and be an effective solution to the optimization problem of the massive data.

Key words: Structural support vector machine,Parallel,MapReduce

郭丽娜,杨明,涂金金. 一种并行结构化支持向量机次梯度投影算法[J]. 计算机科学, 2014, 41(2): 127-130. https://doi.org/

GUO Li-na,YANG Ming and TU Jin-jin. Parallel Primal Estimated Sub-GrAdient Solver for Structural SVM[J]. Computer Science, 2014, 41(2): 127-130. https://doi.org/

参考文献

[1] Cortes C,Vapnik V．Support vector networks[J]．MachineLearning,1995,20(2):273-295
[2] Collobert R,Bengio S,Bengio Y．A Parallel Mixture of SVMsfor Very Large Scale Problems[J]．Neural computation,2002,14(5):1105-1114
[3] Graf H P,Cosatto E,Bottou L,et al.Parallel Support VectorMachines[C]∥The Cascade SVM:Proceedings of the Eighteenth Annual Conference on Neural Information Processing Systems,2005．Vancouver,Canada:MIT Press,2005:521-528 (下转第135页)(上接第130页)
[4] Dean J,Ghemawat S．MapReduce:simplified data processing onlarge clusters[J]．Communications of the ACM,2008,51(1):107-113
[5] Chang E Y．Foundations of Large-Scale Multimedia Information Management and Retrieval[M]．Berlin,Heidelberg:Springer,2011:213-230
[6] Zhao J,Liang Z,Yang Y．Parallelized incremental support vector machines based on MapReduce and Bagging technique[C]∥Proceedings of the International Conference on Information Science and Technology,2012．New York:IEEE,2012:297-301
[7] Zhao Z．Parallel Pegasos for Mahout．http://wenku.baidu.com/view/0e8d6d639b6648d7c1c74661.html,2012
[8] Lanckriet G R,Ghaoui L E,Bhattacharyya C,et al．A RobustMinimax Approach to Classification[J]．Journal of Machine Learning Research,2002,3(3):555-582
[9] Huang K,Yang H,King I,et al．Learning large margin classi-fiers locally and globally[C]∥Proceedings of the twenty-first international conference on Machine learning,2004．New York:ACM,2004:51
[10] Yeung D S,Wang D,Ng W W,et al．Structured Large Margin Machines:Sensitive to Data Distributions[J]．Machine Lear-ning,2007,68:171-200
[11] Xue H,Chen S,Yang Q．Structural support vector machine[C]∥Proceedings of the fifth International Symposium on Neural Networks,2008．Berlin,Heidelberg:Springer,2008:501-511
[12] Shalev-Shwartz S,Singer Y,Srebro N．Pegasos:Primal Esti-mated sub-GrAdient SOlver for SVM[C]∥Proceedings of the 24th International Conference on Machine Learning,2007．New York:ACM,2007:807-814

Metrics

Viewed

Full text

Abstract

Cited

Shared

Discussed