Computer Science, 2022, Vol. 49, Issue (7): 64-72. doi: 10.11896/jsjkx.210500040

Parallel Support Vector Machine Algorithm Based on Clustering and WOA

LIU Wei-ming, AN Ran, MAO Yi-min   

  1. School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou, Jiangxi 341000, China
  • Received: 2021-04-30 Revised: 2021-05-17 Online: 2022-07-15 Published: 2022-07-12
  • About author: LIU Wei-ming, born in 1964, professor, master supervisor. His main research interests include data mining and big data.
    MAO Yi-min, born in 1970, Ph.D., professor, master supervisor. Her main research interests include data mining and big data.
  • Supported by:
    National Natural Science Foundation of China (41562019), National Key Research and Development of China (2018YFC1504705) and Science and Technology Foundation of Jiangxi Province (GJJ151528, GJJ151531).

Abstract: In big data environments, parallel support vector machines (SVMs) are sensitive to redundant data, optimize their parameters poorly, and suffer from load imbalance during parallel execution. To address these problems, a parallel SVM algorithm based on clustering and the whale optimization algorithm, MR-KWSVM, is proposed. First, a K-means and Fisher (KF) strategy deletes redundant data, and the SVM is trained on the reduced data set, which effectively reduces its sensitivity to redundant data. Second, an improved whale optimization algorithm based on a nonlinear convergence factor and a self-adaptive inertia weight (IW-BNAW) is proposed and used to obtain the optimal SVM parameters, which improves the parameter optimization ability of the SVM. Finally, while the parallel SVM is constructed with MapReduce, a time feedback (TFB) strategy schedules the load of the reduce nodes, which improves the parallel efficiency of the cluster and yields a highly parallel SVM. Experimental results show that the proposed algorithm not only maintains the high parallel computing capability of SVM in big data environments, but also significantly improves its classification accuracy and generalization performance.
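
The parameter search mentioned in the abstract can be illustrated with the basic whale optimization algorithm. The sketch below is not the paper's IW-BNAW: it assumes the standard linearly decreasing convergence factor, uses no inertia weight, and updates the best position as soon as a better one is found. The names `woa_minimize` and `toy_error` are hypothetical; `toy_error` is a made-up stand-in for an SVM validation error over (log10 C, log10 gamma), not a result from the paper.

```python
import math
import random


def woa_minimize(objective, bounds, n_whales=20, n_iter=100, seed=0):
    """Minimize `objective` inside box `bounds` with a basic WOA (illustrative only)."""
    rng = random.Random(seed)

    def clip(x):
        # Keep every coordinate inside its (low, high) bound.
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    # Random initial population and its best member.
    whales = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_whales)]
    scores = [objective(w) for w in whales]
    best_i = min(range(n_whales), key=scores.__getitem__)
    best, best_score = list(whales[best_i]), scores[best_i]

    for t in range(n_iter):
        # Assumption: linear convergence factor going from 2 towards 0.
        # The paper replaces this with a nonlinear schedule.
        a = 2.0 * (1.0 - t / n_iter)
        for i, x in enumerate(whales):
            A = 2.0 * a * rng.random() - a
            C = 2.0 * rng.random()
            if rng.random() < 0.5:
                # Encircle the best whale when |A| < 1, else explore
                # around a randomly chosen whale.
                ref = best if abs(A) < 1.0 else whales[rng.randrange(n_whales)]
                new = [r - A * abs(C * r - v) for r, v in zip(ref, x)]
            else:
                # Logarithmic spiral towards the best whale (shape constant b = 1).
                l = rng.uniform(-1.0, 1.0)
                spiral = math.exp(l) * math.cos(2.0 * math.pi * l)
                new = [abs(b - v) * spiral + b for b, v in zip(best, x)]
            whales[i] = clip(new)
            score = objective(whales[i])
            if score < best_score:
                best, best_score = list(whales[i]), score
    return best, best_score


def toy_error(p):
    # Hypothetical smooth surrogate with its minimum at (1.0, -2.0).
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


if __name__ == "__main__":
    params, err = woa_minimize(toy_error, [(-3.0, 3.0), (-5.0, 1.0)])
    print(params, err)
```

In practice the surrogate would be replaced by a function that trains an SVM with the candidate parameters and returns its cross-validation error, which is far more expensive per evaluation than this toy objective.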

Key words: IW-BNAW algorithm, KF strategy, MapReduce framework, SVM algorithm, TFB strategy

  • CLC Number: TP338
