Stacked Support Vector Machine Based on Attacks on Labels of Data Samples

doi:10.11896/jsjkx.181001921

Abstract

Adversarial samples present in real-world datasets can mislead classifiers into incorrect predictions and thus degrade classification performance. If exploited properly, however, they can markedly improve a classifier's generalization ability. Because most existing classifiers do not take adversarial sample information into account, this paper proposes a stacked support vector machine (S-SVM) based on attacks on sample labels. The method randomly selects a certain percentage of samples from the given initial dataset and attacks their labels, that is, it replaces each selected label with a different label that occurs in the dataset, so that these samples become adversarial samples. Training a support vector machine (SVM) on the dataset containing the adversarial samples yields an adversarial support vector machine (A-SVM). The first-order gradient of the A-SVM's output error with respect to the input samples is then computed and embedded into the original features of the input samples to update them. The updated samples are fed into the next A-SVM, which is trained in turn. A number of A-SVMs are cascaded in this stacked manner, layer by layer, until the best classification performance is obtained. Theoretical analysis and experimental results on real-world UCI and KEEL datasets show that the first-order gradient information derived from the adversarial samples not only provides a positive correlation between a classifier's outputs and inputs, but also offers a new way to stack the sub-classifiers in the S-SVM, improving the overall performance of the classifier.

Key words: Adversarial data samples, Attacks on labels, Stacked structure, Support vector machine (SVM)
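The pipeline described in the abstract (label attack, A-SVM training, gradient-based input update, stacking) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the loss, the update rule, or the solver, so the sketch assumes a binary problem with labels in {-1, +1}, a linear SVM trained by subgradient descent on the hinge loss, and an update that moves each sample against the error gradient by a fixed `step`. All function names and parameters (`attack_labels`, `train_stack`, `ratio`, `step`) are hypothetical.

```python
import random

def attack_labels(labels, ratio, classes, rng):
    """Replace the labels of a random `ratio` of samples with a different class."""
    attacked = list(labels)
    n_attack = round(ratio * len(labels))
    for i in rng.sample(range(len(labels)), n_attack):
        attacked[i] = rng.choice([c for c in classes if c != labels[i]])
    return attacked

def decision(w, b, x):
    """Linear decision value w.x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def train_linear_svm(X, y, epochs=100, lr=0.05, lam=0.01):
    """Linear SVM via subgradient descent on the regularised hinge loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            active = t * decision(w, b, x) < 1  # sample violates the margin
            w = [wi - lr * (lam * wi - (t * xi if active else 0.0))
                 for wi, xi in zip(w, x)]
            if active:
                b += lr * t
    return w, b

def hinge_input_gradient(w, b, x, t):
    """Gradient of max(0, 1 - t*(w.x + b)) w.r.t. x: -t*w inside the margin, else 0."""
    if t * decision(w, b, x) < 1:
        return [-t * wi for wi in w]
    return [0.0] * len(w)

def train_stack(X, y, n_layers=3, ratio=0.1, step=0.1, seed=0):
    """Train `n_layers` SVMs in sequence, updating the inputs between layers."""
    rng = random.Random(seed)
    layers, X_cur = [], [list(x) for x in X]
    for _ in range(n_layers):
        y_adv = attack_labels(y, ratio, (-1, 1), rng)  # label attack
        w, b = train_linear_svm(X_cur, y_adv)          # one adversarial SVM
        layers.append((w, b))
        # Assumed update rule: shift each sample against the gradient of the
        # error computed with the labels this layer was trained on.
        X_cur = [[xi - step * gi
                  for xi, gi in zip(x, hinge_input_gradient(w, b, x, t))]
                 for x, t in zip(X_cur, y_adv)]
    return layers, X_cur
```

Calling `train_stack(X, y)` on a list of feature vectors and their labels returns the trained layers together with the inputs as updated after the last layer. The sketch stops after a fixed number of layers; the abstract instead stacks A-SVMs until the best classification performance is reached, which would require a validation set and a stopping criterion that are not specified here.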

CLC number:

TP391.4

JIN Yao, XU Li-ya, LV Hui-lin, GU Su-hang. Stacked Support Vector Machine Based on Attacks on Labels of Data Samples[J]. Computer Science, 2020, 47(1): 110-116. https://doi.org/10.11896/jsjkx.181001921


