Computer Science ›› 2025, Vol. 52 ›› Issue (11): 425-433. doi: 10.11896/jsjkx.240900007
郭嘉铭1, 杜文韬1, 杨超2,3
GUO Jiaming1, DU Wentao1, YANG Chao2,3
Abstract: Deep neural networks are vulnerable to backdoor attacks, in which an attacker implants a backdoor through data poisoning and hijacks the model's behavior. Among such attacks, class-specific attacks involve complex trigger-to-label mappings that are tightly coupled with the normal classification task, allowing them to evade most existing defenses and making them particularly threatening. This paper studies the relationship between attack success rate and model classification performance during the backdoor-implantation process of class-specific attacks, summarizes three properties of these attacks, and on that basis designs a sample-filtering method targeting class-specific attacks. The method applies the Deep Partition Aggregation (DPA) ensemble-learning approach together with majority voting to filter the dataset over repeated iterations. Based on the three properties of class-specific attacks, the effectiveness of the filtering method is proved mathematically, and extensive experiments on standard classification datasets show that more than 95% of backdoor samples are filtered out after four iterations. Comparative experiments against state-of-the-art sample-filtering methods further demonstrate the superiority of the proposed method against class-specific attacks. The experiments are conducted on the open-source GitHub project BackdoorBox.
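The abstract describes an iterative filtering loop built on DPA: the training set is split into disjoint partitions, one model is trained per partition, and samples whose given label disagrees with the ensemble's majority vote are removed before the next round. The following is a minimal NumPy sketch of that loop under stated assumptions: the per-partition learner here is a toy nearest-centroid classifier standing in for the deep networks used in the paper, the partition rule is the index-hash scheme from DPA, and function names (`dpa_filter`, `train_centroid`) are illustrative, not the paper's or BackdoorBox's API.

```python
import numpy as np

def train_centroid(X, y, n_classes):
    # Toy stand-in for a per-partition model: one centroid per class.
    return np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])

def predict(centroids, X):
    # Nearest-centroid prediction for every sample.
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1)

def dpa_filter(X, y, n_classes, k=5, rounds=4):
    """Iteratively drop samples whose label the DPA-style ensemble rejects."""
    keep = np.ones(len(X), dtype=bool)
    for _ in range(rounds):
        idx = np.where(keep)[0]
        votes = np.zeros((len(X), n_classes), dtype=int)
        # Disjoint hash-based partitions, as in DPA.
        for j in range(k):
            part = idx[idx % k == j]
            model = train_centroid(X[part], y[part], n_classes)
            votes[np.arange(len(X)), predict(model, X)] += 1
        consensus = votes.argmax(axis=1)
        # A sample survives only if the ensemble agrees with its label.
        keep &= (consensus == y)
    return keep
```

Poisoned samples carry a label that contradicts their features, so partitions that saw little or no poison outvote them; clean samples pass every round. The real method replaces the centroid learner with trained DNNs and repeats for the four rounds reported in the paper.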