Computer Science ›› 2022, Vol. 49 ›› Issue (7): 310-323. DOI: 10.11896/jsjkx.211000079

• Information Security •

Survey on Attacks and Defenses in Federated Learning

CHEN Ming-xin1,2, ZHANG Jun-bo1,2, LI Tian-rui1   

  1. School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756, China
    2. JD iCity, JD Technology, Beijing 100098, China
  • Received: 2021-10-12 Revised: 2022-01-29 Online: 2022-07-15 Published: 2022-07-12
  • Corresponding author: ZHANG Jun-bo (msjunbozhang@outlook.com)
  • About author: CHEN Ming-xin, born in 1997, postgraduate, is a member of China Computer Federation (mxchen1997@gmail.com). His main research interests include federated learning.
    ZHANG Jun-bo, born in 1986, Ph.D., is a senior member of China Computer Federation. His main research interests include spatio-temporal data mining, urban computing and federated learning.
  • Supported by:
    National Natural Science Foundation of China (72061127001), Natural Science Foundation of Beijing, China (4212021) and Beijing Nova Program (Z201100006820053).

Abstract: Federated learning is proposed to resolve the contradiction between data sharing and privacy preservation. It aims to build collaborative models by securely exchanging irreversible information (e.g., model parameters or gradient updates). However, the risks of privacy leakage and malicious attacks during local training, information exchange and parameter transmission pose major challenges to the practical application of federated learning. This paper surveys the attack behaviors that arise in the modeling and deployment of federated learning, together with the corresponding defense strategies. Firstly, it briefly reviews the development of federated learning, its basic modeling workflow, and background knowledge on attacks and defenses. Next, it classifies attacks on federated learning training and deployment along three dimensions, confidentiality, availability and integrity, and reviews the latest research on privacy theft and malicious attacks. Then, it divides defense countermeasures into two directions, defending against honest-but-curious attackers and defending against malicious attackers, and analyzes the defense capability of each strategy. Finally, it discusses the problems and residual attack risks of defense methods in federated learning practice, and looks ahead to future directions for defense strategies and system design in federated systems.
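To make the attack surface described above concrete, the following is a minimal sketch of one federated-averaging round (in the style of FedAvg) on a toy NumPy linear-regression task. It is an illustration under our own assumptions; the function names (local_update, fedavg_round) and hyperparameters are ours, not taken from the paper.

```python
import numpy as np

def local_update(w, X, y, lr=0.1, epochs=5):
    """Client-side gradient descent on private data; only the
    resulting weights ever leave the client."""
    w = w.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
        w -= lr * grad
    return w

def fedavg_round(w_global, clients):
    """Server step: average the client models. These exchanged parameters
    are the 'irreversible information' that inference attacks try to invert."""
    updates = [local_update(w_global, X, y) for X, y in clients]
    return np.mean(updates, axis=0)

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for _ in range(4):  # four clients, each holding a private data shard
    X = rng.normal(size=(50, 2))
    y = X @ true_w + 0.1 * rng.normal(size=50)
    clients.append((X, y))

w = np.zeros(2)
for _ in range(20):  # communication rounds
    w = fedavg_round(w, clients)
print(w)  # approaches true_w although no raw data left any client
```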

Key words: Attacks and defenses, Data fusion, Data security, Federated learning, Machine learning
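As a concrete instance of the honest-but-curious defenses surveyed here, a client can clip its model update and add Gaussian noise before uploading it, in the style of differentially private federated learning. The sketch below is our own illustration with assumed parameters (clip_norm, noise_std); it is not a calibrated differential-privacy mechanism from the paper.

```python
import numpy as np

def privatize_update(delta, clip_norm=1.0, noise_std=0.1, rng=None):
    """Bound the update's L2 norm, then perturb it, so an honest-but-curious
    server only ever observes a noised vector."""
    rng = rng if rng is not None else np.random.default_rng()
    norm = np.linalg.norm(delta)
    if norm > clip_norm:  # L2 clipping bounds any single client's influence
        delta = delta * (clip_norm / norm)
    return delta + rng.normal(0.0, noise_std, size=delta.shape)

delta = np.array([3.0, -4.0])  # raw local update (L2 norm 5.0)
print(privatize_update(delta, clip_norm=1.0, noise_std=0.05))
```

Increasing noise_std strengthens protection against a curious aggregator but, as with any perturbation-based defense, degrades the accuracy of the aggregated model.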

CLC number: TP391