Computer Science ›› 2026, Vol. 53 ›› Issue (1): 382-394. doi: 10.11896/jsjkx.241200105
HUANG Rong1,2, TANG Yingchun1, ZHOU Shubo1,2, JIANG Xueqin1,2
Abstract: In a backdoor attack, an attacker poisons a dataset to covertly induce the victim model to associate the poisoned data with a target label, threatening the trustworthiness and security of artificial intelligence. Existing backdoor attack methods generally suffer from a trade-off between effectiveness and stealthiness: highly effective triggers are poorly concealed, while well-concealed triggers are weakly effective. To address this problem, a composite-trigger clean-label backdoor attack with joint vision-text features is proposed. The composite trigger is the superposition of two learnable parts, a universal trigger and a personalized trigger. Both the design and the optimization of the composite trigger are constrained by the congruence of pixel values within image blocks, aiming to induce the victim model to capture the congruence pattern, establish an association between the trigger and the target label, and thereby form the backdoor. The universal trigger makes the in-block pixel values of a poisoned image congruent with respect to the bit weight 2; its signal form is single and fixed across all poisoned images. The personalized trigger makes the edge pixel values of a poisoned image congruent with respect to the bit weight of the LoSB (Lower Significant Bit); its signal is specific to the edge locations of each image. Superimposing the two triggers helps balance effectiveness and stealthiness. On this basis, the CLIP (Contrastive Language-Image Pre-training) model is introduced to construct the supervision signal that drives composite-trigger training from joint visual and textual features. The pre-trained CLIP model has strong generalization ability and can guide the composite trigger to absorb heterogeneous textual features, weakening image content features and further enhancing trigger effectiveness. Experiments are conducted on three datasets: CIFAR-10, ImageNet, and GTSRB. The results show that the proposed method evades detection by backdoor defense techniques and surpasses the second-best method by 2.48 percentage points on average in attack success rate; on the four metrics of peak signal-to-noise ratio, structural similarity, gradient magnitude similarity deviation, and learned perceptual image patch similarity, it surpasses the second-best method by 10.61%, 0.31%, 68.44%, and 46.38% on average, respectively. Ablation results verify the advantage of jointly using visual and textual features to guide composite-trigger training, as well as the contributions of the universal and personalized triggers to the effectiveness and stealthiness of the backdoor attack.
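The block-wise congruence constraint described above can be sketched as follows. This is a minimal illustrative stand-in, not the authors' learned trigger: the function name `make_block_congruent`, the 4×4 block size, and the choice of bit weight 2 are assumptions made for demonstration only.

```python
import numpy as np

def make_block_congruent(img, block=4, weight=2):
    """Illustrative sketch: force every pixel in each block x block patch
    to share the same residue modulo `weight` (weight=2 gives a uniform
    parity per block), so a model could learn the congruence pattern."""
    out = img.astype(np.int32).copy()
    h, w = out.shape[:2]
    for y in range(0, h, block):
        for x in range(0, w, block):
            patch = out[y:y+block, x:x+block]        # view into `out`
            r = int(round(patch.mean())) % weight    # target residue for this block
            patch -= (patch - r) % weight            # shift each pixel down by < weight
    return np.clip(out, 0, 255).astype(np.uint8)
```

Because each pixel moves by at most `weight - 1` intensity levels, the perturbation stays visually small, which mirrors the stealthiness motivation in the abstract.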
CLC Number: