Computer Science, 2025, Vol. 52, Issue (4): 336-342. doi: 10.11896/jsjkx.240100005
WANG Yifei1 (王一飞), ZHANG Shengjie1 (张胜杰), XUE Dizhan2 (薛迪展), QIAN Shengsheng2 (钱胜胜)
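Abstract: In recent years, self-supervised learning (SSL) has risen rapidly in deep learning and become a main driving force of the field. In particular, the emergence of pre-trained image models and large language models (LLMs) has attracted worldwide attention. Recent studies, however, have found that self-supervised learning networks are vulnerable to backdoor attacks: by adding a small number of samples carrying a malicious backdoor to the training dataset, an attacker can manipulate the behavior of the pre-trained model on downstream tasks. To defend against such SSL backdoor attacks, this paper proposes a self-supervised backdoor defense method based on a poisoned classifier, called DPC (Defending by Poisoned Classifier). By obtaining the threat model trained on the poisoned dataset, the proposed method can accurately detect poisoned samples. Under the assumption that masking the backdoor trigger effectively changes the activation state of the downstream clustering model, experiments show that DPC achieves a backdoor trigger detection recall of 91.5% and a precision of 27.4%, surpassing the previous SOTA method. These results indicate that the method performs well at detecting potential threats and provides an effective safeguard for the security of self-supervised learning networks.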
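The masking assumption and the two reported metrics can be illustrated with a minimal Python sketch. This is not the authors' DPC implementation. The toy classifier, the occlusion sweep, the patch size and stride, and all function names below are hypothetical choices made only to show the idea: a sample is flagged when hiding a small region changes the output of a model trained on poisoned data, and the flags are then scored with precision and recall.

```python
import numpy as np


def toy_poisoned_classifier(img):
    """Hypothetical stand-in for a model trained on poisoned data.

    Predicts the attacker's target class (1) when the 16 brightest pixels
    are almost all saturated, which a 4x4 all-ones trigger causes.
    """
    top16 = np.sort(img.ravel())[-16:]
    return int(top16.mean() > 0.9)


def flips_when_masked(classify, img, patch=4, stride=2, fill=0.0):
    """Return True if masking any patch-sized window changes the prediction."""
    base = classify(img)
    h, w = img.shape
    for top in range(0, h - patch + 1, stride):
        for left in range(0, w - patch + 1, stride):
            masked = img.copy()
            masked[top:top + patch, left:left + patch] = fill
            if classify(masked) != base:
                return True
    return False


def precision_recall(flagged, is_poisoned):
    """Precision and recall of boolean detection flags against ground truth."""
    flagged = np.asarray(flagged, dtype=bool)
    is_poisoned = np.asarray(is_poisoned, dtype=bool)
    tp = np.sum(flagged & is_poisoned)
    precision = tp / max(flagged.sum(), 1)    # share of flagged samples that are poisoned
    recall = tp / max(is_poisoned.sum(), 1)   # share of poisoned samples that are flagged
    return float(precision), float(recall)


def make_toy_dataset(n_clean=20, n_poison=5, size=12, seed=0):
    """Synthetic images: clean pixels lie in [0, 0.5); poisoned ones add a 4x4 trigger."""
    rng = np.random.default_rng(seed)
    images, labels = [], []
    for i in range(n_clean + n_poison):
        img = rng.uniform(0.0, 0.5, size=(size, size))
        poisoned = i >= n_clean
        if poisoned:
            r, c = rng.integers(0, size - 4 + 1, size=2)
            img[r:r + 4, c:c + 4] = 1.0  # hypothetical patch trigger
        images.append(img)
        labels.append(poisoned)
    return images, labels


images, labels = make_toy_dataset()
flags = [flips_when_masked(toy_poisoned_classifier, im) for im in images]
precision, recall = precision_recall(flags, labels)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

The toy setting is far easier than the real task: a practical detector works on learned features rather than raw pixel statistics, and clean samples can also react to masking, which is one reason a high recall can coexist with a much lower precision.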