Computer Science ›› 2024, Vol. 51 ›› Issue (11): 368-378. doi: 10.11896/jsjkx.231100044

• Information Security •

Federated Learning Model Based on Update Quality Detection and Malicious Client Identification

LEI Cheng1, ZHANG Lin1,2

  1. College of Computer, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
    2. Jiangsu High Technology Research Key Laboratory for Wireless Sensor Networks, Nanjing 210003, China
  • Received: 2023-11-07  Revised: 2024-04-14  Online: 2024-11-15  Published: 2024-11-06
  • Corresponding author: ZHANG Lin (zhangl@njupt.edu.cn)
  • About author: LEI Cheng (leicheng2021@163.com)
  • Supported by:
    National Natural Science Foundation of China (61872196, 61872194); Scientific & Technological Support Project of Jiangsu Province (BE2017166); Natural Science Foundation of Nanjing University of Posts and Telecommunications (NY222142)

Federated Learning Model Based on Update Quality Detection and Malicious Client Identification

LEI Cheng1, ZHANG Lin1,2   

  1. College of Computer, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
    2. Jiangsu High Technology Research Key Laboratory for Wireless Sensor Networks, Nanjing 210003, China
  • Received: 2023-11-07  Revised: 2024-04-14  Online: 2024-11-15  Published: 2024-11-06
  • About author: LEI Cheng, born in 1999, postgraduate. His main research interests include federated learning and privacy protection.
    ZHANG Lin, born in 1980, Ph.D, associate professor, postgraduate supervisor. Her main research interests include trusted computing, federated learning and privacy protection.
  • Supported by:
    National Natural Science Foundation of China (61872196, 61872194), Scientific & Technological Support Project of Jiangsu Province (BE2017166) and Natural Science Foundation of Nanjing University of Posts and Telecommunications (NY222142).

Abstract: As a form of distributed machine learning, federated learning alleviates the data-silo problem: it transmits only model parameters between the server and clients without sharing local data, which improves the privacy of training data but also leaves federated learning vulnerable to attacks by malicious clients. Existing work focuses mainly on intercepting the updates uploaded by malicious clients. To address this, a federated learning model based on update quality detection and malicious client identification, umFL, is studied to improve the training performance of the global model and the robustness of federated learning. Specifically, the quality of each client's update is computed from the loss value of its training in each round; update quality detection is used to select the subset of clients participating in each round, and the similarity between the updated local model and the previous round's global model is computed to decide whether the client has made a positive update, with negative updates filtered out. Meanwhile, a beta distribution function is introduced to update client reputation values, and clients whose reputation becomes too low are marked as malicious and barred from subsequent training. The effectiveness of the proposed algorithm is tested on the MNIST and CIFAR10 datasets using convolutional neural networks. Experimental results show that the proposed model remains safe under attacks by 20%~40% malicious clients; in particular, with 40% malicious clients it improves model test accuracy by 40% and 20% on MNIST and CIFAR10, respectively, compared with traditional federated learning, and improves model convergence speed by 25.6% and 22.8%, respectively.
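The paper's code is not reproduced on this page; as a rough illustration of the positive-update test described in the abstract, the sketch below assumes the similarity is measured as cosine similarity between flattened parameter vectors, with a hypothetical acceptance threshold. The names flatten_params and is_positive_update are illustrative, not the paper's actual implementation.

```python
import numpy as np

def flatten_params(model_params):
    """Concatenate all parameter tensors of a model into one 1-D vector."""
    return np.concatenate([np.asarray(p, dtype=np.float64).ravel() for p in model_params])

def is_positive_update(local_params, prev_global_params, threshold=0.0):
    """Return (is_positive, cosine_similarity) for a client's updated local model
    compared with the previous round's global model; updates whose similarity
    falls below the threshold are treated as negative and filtered out."""
    u = flatten_params(local_params)
    g = flatten_params(prev_global_params)
    cos_sim = float(np.dot(u, g) / (np.linalg.norm(u) * np.linalg.norm(g) + 1e-12))
    return cos_sim >= threshold, cos_sim

# Toy usage: a small "global model" and two candidate client updates.
prev_global = [np.array([1.0, 2.0]), np.array([0.5])]
honest_update = [np.array([1.1, 1.9]), np.array([0.6])]
flipped_update = [np.array([-1.0, -2.0]), np.array([-0.5])]
print(is_positive_update(honest_update, prev_global))   # high similarity -> positive
print(is_positive_update(flipped_update, prev_global))  # negative similarity -> filtered
```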

Key words: Federated learning, Client update quality, Client reputation value, Malicious client identification, Client selection

Abstract: As a form of distributed machine learning, federated learning alleviates the problem of data islands: only model parameters are transmitted between the server and clients, without sharing local data, which improves the privacy of training data but also makes federated learning vulnerable to malicious client attacks. Existing research focuses mainly on intercepting the updates uploaded by malicious clients. To address this, a federated learning model based on update quality detection and malicious client identification, named umFL, is studied to improve the training performance of the global model and the robustness of federated learning. Specifically, the quality of each client's update is calculated from the loss value of its training in each round, and update quality detection is used to select the subset of clients participating in each round. The similarity between the updated local model and the previous round's global model is calculated to determine whether the client has made a positive update, and negative updates are filtered out. Meanwhile, the beta distribution function is introduced to update client reputation values; clients with low reputation values are marked as malicious and excluded from subsequent training. The effectiveness of the proposed algorithm is tested on the MNIST and CIFAR10 datasets using convolutional neural networks. Experimental results show that the proposed model remains safe under attacks by 20%~40% malicious clients. In particular, with 40% malicious clients, umFL improves model test accuracy by 40% and 20% on MNIST and CIFAR10, respectively, compared with traditional federated learning, and improves model convergence speed by 25.6% and 22.8%, respectively.
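As a companion sketch for the reputation mechanism mentioned above, the snippet below assumes a per-client Beta(alpha, beta) counter whose posterior mean serves as the reputation value, with an arbitrary exclusion threshold of 0.3. The class name BetaReputation, the uniform prior, and the threshold value are assumptions rather than the paper's actual parameterization.

```python
class BetaReputation:
    """Hypothetical per-client reputation tracker: positive updates increment
    alpha, negative updates increment beta, and the reputation is the posterior
    mean alpha / (alpha + beta) of a Beta(alpha, beta) distribution."""

    def __init__(self, alpha: float = 1.0, beta: float = 1.0,
                 malicious_threshold: float = 0.3) -> None:
        self.alpha = alpha
        self.beta = beta
        self.malicious_threshold = malicious_threshold

    def record(self, positive: bool) -> None:
        """Record the outcome of one round's update quality detection."""
        if positive:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def reputation(self) -> float:
        return self.alpha / (self.alpha + self.beta)

    def is_malicious(self) -> bool:
        # Clients whose reputation drops below the threshold would be
        # excluded from subsequent training rounds.
        return self.reputation < self.malicious_threshold

# Toy usage: one positive update followed by four negative ones.
rep = BetaReputation()
for positive in [True, False, False, False, False]:
    rep.record(positive)
print(round(rep.reputation, 3), rep.is_malicious())  # 0.286 True
```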

Key words: Federated learning, Client update quality, Client reputation value, Malicious client identification, Client selection

CLC Number: TP393