Survey of Adversarial Attacks and Defense Methods for Deep Neural Networks

doi:10.11896/jsjkx.210900163

Abstract

Abstract: Deep neural networks are driving a new wave of artificial intelligence development and have made remarkable achievements in many fields. However, studies have shown that deep neural networks are vulnerable to adversarial attacks, which cause them to output incorrect results, and their security has therefore attracted great attention. This paper surveys the current state of research on adversarial attacks and defense methods from the perspective of deep neural network security. Firstly, it briefly describes the concepts related to adversarial attacks on deep neural networks and the explanations for why adversarial examples exist. Secondly, it introduces adversarial attack methods in five categories (gradient-based, optimization-based, transfer-based, GAN-based, and decision boundary-based attacks) and analyzes the characteristics of each. Thirdly, it describes defense methods against adversarial attacks from three aspects: data preprocessing, enhancing the robustness of deep neural network models, and detecting adversarial examples. Then, it lists examples of adversarial attacks and defenses in semantic segmentation, audio, text recognition, object detection, face recognition, and reinforcement learning. Finally, based on the analysis of these attack and defense methods, it discusses the development trends of adversarial attacks and defenses for deep neural networks.

Key words: Artificial intelligence, Deep neural network, Neural network security, Adversarial attacks, Defense methods

CLC Number: TP391

ZHAO Hong, CHANG You-kang, WANG Wei-jie. Survey of Adversarial Attacks and Defense Methods for Deep Neural Networks[J]. Computer Science, 2022, 49(11A): 210900163-11. https://doi.org/10.11896/jsjkx.210900163

