Computer Science (计算机科学) ›› 2016, Vol. 43 ›› Issue (12): 97-100. doi: 10.11896/j.issn.1002-137X.2016.12.017
曹路
CAO Lu
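Abstract: Traditional support vector machines (SVMs) perform poorly on imbalanced data. To improve recognition accuracy for minority-class samples, an oversampling method based on support vectors is proposed. First, noise is removed from the original data set using the K-nearest-neighbor idea. Next, an SVM is trained on the training set to obtain the support vectors, and noise following a certain distribution is added to each minority-class support vector, which increases the number of minority-class samples and yields a relatively balanced data set. Finally, an SVM is trained on the new data set. Experimental results show that the method is effective on both synthetic data sets and UCI benchmark data sets.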
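The steps in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the abstract does not give the exact KNN cleaning rule or the noise distribution, so the sketch assumes a two-class problem, drops a sample only when none of its k nearest neighbours shares its label, and uses isotropic Gaussian noise with a fixed `sigma`. The function names `knn_clean` and `sv_oversample` and all parameter defaults are hypothetical; the SVM and neighbour search come from scikit-learn.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.svm import SVC


def knn_clean(X, y, k=5):
    """Assumed cleaning rule: drop samples with no same-label neighbour."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)            # column 0 is normally the point itself
    neighbour_labels = y[idx[:, 1:]]     # labels of the k other neighbours
    agree = (neighbour_labels == y[:, None]).mean(axis=1)
    keep = agree > 0                     # at least one neighbour agrees
    return X[keep], y[keep]


def sv_oversample(X, y, minority_label, sigma=0.1, k=5, seed=0):
    """Balance a two-class set by jittering minority-class support vectors."""
    rng = np.random.default_rng(seed)

    # Step 1: remove noisy samples with the KNN rule above.
    X, y = knn_clean(X, y, k)

    # Step 2: train an SVM and collect the minority-class support vectors.
    svm = SVC(kernel="rbf", gamma="scale").fit(X, y)
    sv_idx = svm.support_
    minority_sv = X[sv_idx[y[sv_idx] == minority_label]]

    # Step 3: add Gaussian noise (an assumption) to randomly chosen minority
    # support vectors until both classes have the same number of samples.
    n_min = int(np.sum(y == minority_label))
    n_needed = (len(y) - n_min) - n_min
    if n_needed <= 0 or len(minority_sv) == 0:
        return X, y
    base = minority_sv[rng.integers(0, len(minority_sv), size=n_needed)]
    synthetic = base + rng.normal(0.0, sigma, size=base.shape)

    X_new = np.vstack([X, synthetic])
    y_new = np.concatenate([y, np.full(n_needed, minority_label, dtype=y.dtype)])
    return X_new, y_new
```

The final step of the method would then be to fit a fresh classifier on the balanced set, for example `SVC().fit(X_new, y_new)`. Sampling support vectors with replacement is one simple way to reach an exact balance; the paper may instead generate a fixed number of noisy copies per support vector.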