计算机科学 (Computer Science), 2022, Vol. 49, Issue (7): 310-323. doi: 10.11896/jsjkx.211000079
陈明鑫1,2, 张钧波1,2, 李天瑞1
CHEN Ming-xin1,2, ZHANG Jun-bo1,2, LI Tian-rui1
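Abstract: Federated learning addresses the tension between data sharing and privacy protection. It aims to build a federated model by securely exchanging irreversible information, such as model parameters or gradient updates. However, federated learning still faces risks of malicious attack and privacy leakage during local model training, information exchange, and parameter transmission, which poses major challenges to its practical application. This paper surveys in detail the attacks that arise when federated learning models are built and deployed, together with the corresponding defense strategies. First, it briefly introduces the basic workflow of federated learning and the relevant background on attacks and defenses. Next, it classifies attacks on federated learning training and deployment from three perspectives (confidentiality, availability, and integrity) and reviews the latest research on privacy theft and malicious attacks. Then, it divides defense methods into two directions, defending against honest-but-curious attackers and defending against malicious attackers, and analyzes the defensive capability of each strategy. Finally, it summarizes the problems that defense methods face in federated learning practice and the attack risks that may result, and discusses future directions for defense strategies in federated systems.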
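The exchange of model parameters mentioned in the abstract can be illustrated with a minimal sketch. It assumes a federated-averaging-style scheme in which each client takes one gradient step on its private data and the server combines the results weighted by client dataset size. The function names (`local_update`, `federated_average`, `train`), the 1-D linear model, and the toy data are all hypothetical choices for illustration; this is not the paper's own method, and it includes none of the defenses the survey discusses.

```python
from typing import List, Tuple

Weights = List[float]                 # [w, b] for the model y ≈ w * x + b
Dataset = List[Tuple[float, float]]   # private (x, y) pairs held by one client


def local_update(weights: Weights, data: Dataset, lr: float = 0.1) -> Weights:
    """One gradient-descent step on a client's private data (mean squared error)."""
    w, b = weights
    n = len(data)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
    return [w - lr * grad_w, b - lr * grad_b]


def federated_average(client_weights: List[Weights], client_sizes: List[int]) -> Weights:
    """Server-side aggregation: average parameters weighted by client dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def train(clients: List[Dataset], rounds: int = 200) -> Weights:
    """Run several rounds; only parameters, never raw data, reach the server."""
    global_weights: Weights = [0.0, 0.0]
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in clients]
        sizes = [len(data) for data in clients]
        global_weights = federated_average(updates, sizes)
    return global_weights


if __name__ == "__main__":
    # Two hypothetical clients whose data follow y = 2x + 1.
    clients = [[(0.0, 1.0), (1.0, 3.0)], [(2.0, 5.0), (3.0, 7.0), (-1.0, -1.0)]]
    print(train(clients))
```

In this sketch the server only ever sees the updated parameters returned by `local_update`. Those exchanged values are the attack surface the survey examines: a curious server could try to infer private data from them, and a malicious client could submit poisoned ones.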