Computer Science ›› 2024, Vol. 51 ›› Issue (3): 351-359. doi: 10.11896/jsjkx.221200035
CHEN Jinyin1,2, LI Xiao1, JIN Haibo1, CHEN Ruoxi1, ZHENG Haibin1,2, LI Hu3
Abstract: The performance of deep learning models keeps improving, but their parameter scales grow ever larger, which hinders deployment on edge devices. To address this problem, researchers proposed knowledge distillation (KD), which transfers the "dark knowledge" of a large teacher model to quickly produce a high-performance, compact student model, enabling lightweight deployment on edge devices. In practice, however, many teacher models are downloaded from public platforms without the necessary security review, which poses a threat to knowledge distillation tasks. To this end, we propose CheatKD, the first backdoor attack targeting feature-based KD: the backdoor embedded in the teacher model survives the KD process and is transferred to the student model, thereby indirectly poisoning the student. Specifically, while training the teacher model, CheatKD initializes a random trigger and iteratively optimizes it to control the activation values of a subset of neurons (the poisoned neurons) in a specific distillation layer of the teacher, driving those activations toward a fixed value. This poisoned-neuron assimilation ultimately poisons the teacher model and implants the backdoor. Moreover, the backdoor resists the filtering effect of knowledge distillation and is passed on to the student model. In experiments on four datasets and six model combinations, CheatKD achieves an average attack success rate above 85% and generalizes well across a variety of distillation methods.
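To make the trigger-optimization idea described in the abstract concrete, the following PyTorch sketch shows one plausible form of the inner loop: a trigger patch is optimized so that, on triggered inputs, selected channels of a chosen distillation layer are pushed toward a fixed activation value while the teacher predicts the attack target. This is a minimal illustration under stated assumptions, not the authors' implementation; all names (`teacher`, `distill_layer`, `poison_idx`, `fixed_value`, the loss weight `lam`, and the 8x8 corner patch) are hypothetical.

```python
import torch
import torch.nn.functional as F

# Hypothetical sketch of CheatKD-style trigger optimization (not the paper's code).
# Idea: optimize a trigger so that, on triggered inputs, a chosen subset of
# "poisoned" channels in one distillation layer of the teacher is driven toward
# a fixed activation value, while the teacher classifies the input as the target.
def optimize_trigger(teacher, distill_layer, loader, target_class,
                     poison_idx, fixed_value=2.0, steps=200,
                     lr=0.1, lam=1.0, device="cuda"):
    """poison_idx: channel indices of the poisoned neurons in distill_layer.
    fixed_value: the constant the poisoned activations are assimilated toward."""
    teacher.eval()

    # Capture the feature map of the chosen distillation layer via a forward hook.
    feats = {}
    handle = distill_layer.register_forward_hook(
        lambda mod, inp, out: feats.__setitem__("out", out))

    # A small trigger patch (assumed 8x8, bottom-right corner), optimized directly.
    trigger = torch.rand(1, 3, 8, 8, device=device, requires_grad=True)
    opt = torch.optim.Adam([trigger], lr=lr)

    for _, (x, _) in zip(range(steps), loader):
        x_trig = x.to(device).clone()
        x_trig[:, :, -8:, -8:] = trigger              # stamp trigger onto the batch

        logits = teacher(x_trig)
        act = feats["out"]                            # (B, C, H, W) feature map

        # Assimilation loss: poisoned channels' mean activation -> fixed_value.
        poisoned = act[:, poison_idx].mean(dim=(0, 2, 3))
        loss_assim = F.mse_loss(poisoned, torch.full_like(poisoned, fixed_value))

        # Target loss: triggered inputs should be classified as the attack target.
        tgt = torch.full((x_trig.size(0),), target_class,
                         dtype=torch.long, device=device)
        loss_tgt = F.cross_entropy(logits, tgt)

        loss = loss_tgt + lam * loss_assim
        opt.zero_grad()
        loss.backward()
        opt.step()
        trigger.data.clamp_(0.0, 1.0)                 # keep the patch a valid image

    handle.remove()
    return trigger.detach()
```

In the full attack as described, this trigger optimization would presumably alternate with training the teacher itself, so that the assimilated activations are baked into the teacher's weights and survive feature-based distillation into the student.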