Computer Science ›› 2025, Vol. 52 ›› Issue (12): 374-383. doi: 10.11896/jsjkx.250300064
CHEN Xianyi1,2,3, ZHANG Chengjuan2, QIAN Jiangfeng4, GUO Qianbin2, CUI Qi1,2, FU Zhangjie1,2
Abstract: Model backdoor attacks typically hide the backdoor in the model's parameters and activate a preset output only on specially crafted samples. However, such methods are easily weakened by defenses such as parameter purification, after which the backdoor becomes hard to trigger. To address this, a backdoor trigger mechanism based on feature distributions is designed for the first time, yielding a structural backdoor that does not depend on model parameters and thus achieves highly stealthy, highly robust backdoor implantation. First, distributed triggers in the model's feature space are used to generate backdoor images, making backdoor activation more stable and improving attack reliability. Second, a backdoor structure composed of a distribution detector and a backdoor register is constructed and embedded into the target layer; because this structural backdoor does not rely on model parameters, it significantly enhances robustness and resistance to detection. Finally, the distribution detector extracts the distributed trigger pattern, upon which the backdoor register is activated to pollute the model's features, ensuring that the backdoor fires precisely under the intended conditions and making its effect more targeted. Experimental results show that the proposed method still achieves a 100% attack success rate after the model undergoes 20 rounds of parameter modification, and that it evades multiple state-of-the-art backdoor detectors.
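The abstract describes a structural backdoor built from two inserted modules, a distribution detector and a backdoor register, embedded in a target layer so that backdoor behavior lives in control flow rather than in trainable weights. A minimal NumPy sketch of that idea follows; all class names, the statistic-threshold detection rule, and the feature-overwrite pollution scheme are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

class DistributionDetector:
    """Hypothetical detector: flags a feature vector whose statistics
    match the planted trigger distribution (here, the mean over a
    chosen channel subset exceeding a threshold)."""
    def __init__(self, trigger_channels, threshold):
        self.trigger_channels = trigger_channels
        self.threshold = threshold

    def matches(self, features):
        return features[self.trigger_channels].mean() > self.threshold

class BackdoorRegister:
    """Hypothetical register: when activated, overwrites the leading
    feature entries with a fixed pollution pattern so that downstream
    layers map the input to the attacker's target class."""
    def __init__(self, pollution):
        self.pollution = pollution

    def pollute(self, features):
        out = features.copy()
        out[: len(self.pollution)] = self.pollution
        return out

def backdoored_layer(features, detector, register):
    """Structural backdoor inserted at the target layer: its behavior
    depends only on the added modules, not on any trainable parameter
    of the host model."""
    if detector.matches(features):
        return register.pollute(features)
    return features

# Benign features pass through unchanged; triggered ones are polluted.
det = DistributionDetector(trigger_channels=[0, 1], threshold=0.5)
reg = BackdoorRegister(pollution=np.array([9.0, 9.0]))
benign = np.zeros(8)
triggered = np.zeros(8)
triggered[[0, 1]] = 1.0  # match the trigger distribution
clean_out = backdoored_layer(benign, det, reg)
attack_out = backdoored_layer(triggered, det, reg)
```

Because the decision logic sits in inserted control flow rather than in weights, retraining or purifying the host model's parameters leaves this sketch's behavior untouched, which mirrors the robustness claim in the abstract.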