Computer Science ›› 2020, Vol. 47 ›› Issue (1): 110-116. doi: 10.11896/jsjkx.181001921

• Database & Big Data & Data Science •

Stacked Support Vector Machine Based on Attacks on Labels of Data Samples

JIN Yao1,XU Li-ya1,LV Hui-lin1,GU Su-hang2,3   

  1. School of Information Science and Engineering, Changzhou University, Changzhou, Jiangsu 213164, China;
  2. School of Digital Media, Jiangnan University, Wuxi, Jiangsu 214122, China;
  3. College of Information Engineering and Technology, Changzhou Vocational Institute of Light Industry, Changzhou, Jiangsu 213164, China
  • Received:2018-10-15 Published:2020-01-19
  • About author: JIN Yao, born in 1971, master, associate professor. His main research interests include computer application technology and library and information science; GU Su-hang, born in 1989, doctoral student. His main research interests include artificial intelligence and machine learning.
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (81701793) and Science and Technology Project of Changzhou (CJ20190016).

Abstract: Adversarial data samples, which do exist in real-world datasets, can mislead classifiers into incorrect predictions and thus degrade classification performance. However, reasonable use of adversarial data samples can distinctly improve the generalization of classifiers. Since most existing classifiers do not take information about adversarial data samples into account when building their classification models, a stacked support vector machine based on attacks on the labels of data samples, called S-SVM, is proposed; it aims to achieve better classification performance by learning from adversarial data samples. In a given dataset, a certain percentage of data samples is randomly chosen as adversarial data samples; that is, the label of each chosen sample is replaced by another label from the given dataset that differs from its original label. An adversarial support vector machine (A-SVM) is then generated by training a support vector machine (SVM) on the given dataset, which now contains the adversarial data samples. The first-order gradient of the output error of the generated A-SVM with respect to the input samples is then computed, and the input samples are updated by embedding this first-order gradient information into their original feature space. The updated data samples are then fed into the next A-SVM for further training, which gradually improves on the classification performance of the current A-SVM. S-SVM is thus formed by stacking several A-SVMs layer by layer, and the best classification results can be obtained by the corresponding S-SVM. Theoretical analysis and experimental results on real-world UCI and KEEL datasets show that the first-order gradient information computed by learning from the adversarial data samples not only provides a positive relation between the outputs and the inputs of a classifier, but also provides a novel way to stack the front and rear sub-classifiers in the proposed S-SVM.
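
The pipeline described in the abstract (attack the labels, train a classifier, compute the gradient of its error with respect to the inputs, update the inputs, repeat) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the error function, the update rule, or the kernel, so the sketch assumes a linear SVM with hinge loss trained by subgradient descent, binary labels in {-1, +1}, and a plain gradient step `x - step * grad` as the way of "embedding" the gradient into the feature space. All function names (`flip_labels`, `train_linear_svm`, `input_gradient`, `train_stack`) and all hyper-parameter values are hypothetical.

```python
import random


def flip_labels(labels, fraction, seed=0):
    """Label attack: give a random `fraction` of samples a different label."""
    rng = random.Random(seed)
    classes = sorted(set(labels))
    if len(classes) < 2:
        raise ValueError("need at least two distinct labels to flip")
    n_flip = round(fraction * len(labels))
    chosen = rng.sample(range(len(labels)), n_flip)
    attacked = list(labels)
    for i in chosen:
        # Replace with any label of the dataset other than the original one.
        attacked[i] = rng.choice([c for c in classes if c != labels[i]])
    return attacked, sorted(chosen)


def decision(w, b, x):
    """Linear decision value w.x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, lam=0.01, lr=0.05, epochs=100):
    """Assumed stand-in for one A-SVM: linear soft-margin SVM, labels in {-1, +1}."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            if t * decision(w, b, x) < 1:  # margin violated: hinge term is active
                w = [wj - lr * (lam * wj - t * xj) for wj, xj in zip(w, x)]
                b += lr * t
            else:  # only the regulariser contributes
                w = [wj - lr * lam * wj for wj in w]
    return w, b


def input_gradient(w, b, x, t):
    """Gradient of the hinge loss max(0, 1 - t*(w.x + b)) with respect to x."""
    if t * decision(w, b, x) < 1:
        return [-t * wj for wj in w]
    return [0.0] * len(w)


def train_stack(X, y_attacked, n_layers=3, step=0.1):
    """Stack layers: each layer trains on inputs updated by the previous one."""
    layers = []
    X = [list(x) for x in X]  # work on a copy
    for _ in range(n_layers):
        w, b = train_linear_svm(X, y_attacked)
        layers.append((w, b))
        # Assumed update rule: move each sample against its error gradient.
        X = [[xj - step * gj for xj, gj in zip(x, input_gradient(w, b, x, t))]
             for x, t in zip(X, y_attacked)]
    return layers, X


# Toy data: two well-separated clusters (illustrative values only).
X = [[0.0, 0.2], [0.3, 0.1], [0.1, 0.4], [0.2, 0.0],
     [2.0, 2.1], [2.2, 1.9], [1.8, 2.3], [2.1, 2.0]]
y = [-1, -1, -1, -1, 1, 1, 1, 1]
y_attacked, flipped = flip_labels(y, fraction=0.25, seed=1)
layers, X_updated = train_stack(X, y_attacked, n_layers=3)
```

With hinge loss, the input gradient is nonzero only for samples inside or on the wrong side of the margin, so in this sketch only those samples move between layers. A kernel SVM or a different error function, as the paper may use, would change both the gradient and the update.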

Key words: Adversarial data samples, Attacks on labels, Stacked structure, Support vector machine (SVM)

CLC Number: 

  • TP391.4