Computer Science ›› 2024, Vol. 51 ›› Issue (1): 345-354. doi: 10.11896/jsjkx.230400123

• Information Security •

  • Corresponding author: DAI Hua (daihua@njupt.edu.cn)
  • About author: (2019040109@njupt.edu.cn)

Lightweight Differential Privacy Federated Learning Based on Gradient Dropout

WANG Zhousheng1, YANG Geng1,2, DAI Hua1,2   

  1. 1 School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
     2 Jiangsu Key Laboratory of Big Data Security and Intelligent Processing, Nanjing 210023, China
  • Received:2023-04-18 Revised:2023-09-22 Online:2024-01-15 Published:2024-01-12
  • About author: WANG Zhousheng, born in 1995, Ph.D. candidate, is a student member of CCF (No.92685G). His main research interests include differential privacy and privacy-preserving machine learning.
    DAI Hua, born in 1982, Ph.D., professor, Ph.D. supervisor, is a member of CCF (No.40161M). His main research interests include cloud computing security and privacy protection.
  • Supported by:
    National Natural Science Foundation of China(61872197,61972209,62372244) and Postgraduate Research and Practice Innovation Program of Jiangsu Province(KYCX21_0791).


Abstract: To address the privacy issues arising in traditional machine learning, federated learning has received widespread attention and research as the first collaborative online learning solution in which users upload only model updates rather than their real data. However, the model updates trained and uploaded locally by users may still contain sensitive information, which raises new privacy concerns. At the same time, because the complete training must be performed locally by each user, the computational and communication overheads become particularly critical, so there is also an urgent need for a lightweight federated learning architecture. In this paper, a federated learning framework with a differential privacy mechanism is adopted to meet these further privacy requirements. In addition, a Fisher-information-matrix-based Dropout mechanism, FisherDropout, is proposed for the first time for the optimal selection of each dimension of the gradients produced by client-side training. This mechanism greatly reduces computation cost, communication cost, and privacy budget, establishing a federated learning framework with both privacy and lightweight advantages. Extensive experiments on real-world datasets demonstrate the effectiveness of the scheme. Experimental results show that, compared with other federated learning frameworks, the FisherDropout mechanism saves 76.8%~83.6% of communication overhead and 23.0%~26.2% of computational overhead in the best case, and it also shows outstanding advantages in balancing privacy and utility under differential privacy.
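The gradient-selection idea described in the abstract can be sketched as follows. This is an illustrative reconstruction, not the authors' code: it assumes per-dimension Fisher information is approximated by the squared gradient, that only the top fraction of dimensions is kept, and that the kept sub-vector is clipped and perturbed with Gaussian noise before upload, so the privacy budget and communication are spent on a subset of dimensions. The names `fisher_dropout`, `rho`, `clip`, and `sigma` are hypothetical.

```python
import numpy as np

def fisher_dropout(grad, rho=0.2, clip=1.0, sigma=1.0, rng=None):
    """Sketch: keep the top-rho fraction of gradient dimensions ranked by
    empirical Fisher information, clip the kept sub-vector, and add
    Gaussian noise to it (Gaussian mechanism) before upload."""
    rng = rng if rng is not None else np.random.default_rng()
    flat = grad.ravel()
    # Empirical Fisher information approximation: squared gradient per dimension.
    fisher = flat ** 2
    k = max(1, int(rho * flat.size))
    keep = np.argpartition(fisher, -k)[-k:]  # indices of the top-k dimensions
    # Clip the selected sub-vector to L2 norm <= clip.
    sub = flat[keep]
    sub = sub * min(1.0, clip / (np.linalg.norm(sub) + 1e-12))
    # Gaussian noise only on surviving dimensions, so budget is spent on k
    # dimensions instead of the full model.
    noisy = sub + rng.normal(0.0, sigma * clip, size=k)
    out = np.zeros_like(flat)
    out[keep] = noisy
    return out.reshape(grad.shape), keep

g = np.array([0.05, -1.2, 0.8, 0.01, -0.4, 2.0])
sparse_g, kept = fisher_dropout(g, rho=0.5, clip=1.0, sigma=0.1)
```

Only the `k` kept dimensions (and their indices) would need to be transmitted, which is where the communication saving in the abstract would come from under these assumptions.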

Key words: Federated learning, Differential privacy, Fisher information matrix, Dropout, Lightweight

CLC number: TP309