Computer Science ›› 2022, Vol. 49 ›› Issue (9): 297-305. doi: 10.11896/jsjkx.210800108

• Information Security •

Federated Learning Scheme Based on Secure Multi-party Computation and Differential Privacy

TANG Ling-tao1, WANG Di1, ZHANG Lu-fei1, LIU Sheng-yun2   

  1 State Key Laboratory of Mathematical Engineering and Advanced Computing, Wuxi, Jiangsu 214125, China
    2 School of Cyber Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
  • Received: 2021-08-12  Revised: 2022-02-27  Online: 2022-09-15  Published: 2022-09-09
  • Corresponding author: LIU Sheng-yun (shengyun.liu@sjtu.edu.cn)
  • About author: TANG Ling-tao, born in 1994, Ph.D. candidate (tangbdy@126.com). His main research interests include information security and privacy-preserving machine learning.
    LIU Sheng-yun, born in 1985, Ph.D., associate professor. His main research interests include blockchain, secure multi-party computation, distributed storage systems and federated learning.
  • Supported by:
    National Key Research and Development Program of China (2016YFB1000500) and National Science and Technology Major Project (2018ZX01028102).

Abstract: Federated learning provides a novel solution to collaborative learning among mutually untrusted entities. Through a local-training-and-central-aggregation pattern, the federated learning algorithm trains a global model while protecting the local data privacy of each entity. However, recent studies show that the local models uploaded by clients and the global models produced by the server may still leak users' private information. Secure multi-party computation and differential privacy are two mainstream privacy-preserving techniques, which protect the privacy of the computation process and of the computation outputs, respectively. Few works exploit the benefits of these two techniques at the same time. This paper proposes a privacy-preserving federated learning scheme for deep learning that combines secure multi-party computation and differential privacy. Clients add noise to their local models and secret-share them among multiple servers. The servers aggregate these model shares by secure multi-party computation to obtain a private global model. The proposed scheme not only protects the privacy of the local model updates uploaded by clients, but also prevents adversaries from inferring sensitive information from globally shared data such as the aggregated model. The scheme also tolerates the dropout of unstable clients and is compatible with complex aggregation functions. In addition, it can be naturally extended to the decentralized setting for real-world applications where no trusted center exists. We implement our system in Python and PyTorch. Experiments validate that the proposed scheme achieves the same level of efficiency and accuracy as plaintext federated learning.
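
To make the training flow concrete, below is a minimal Python sketch (Python/PyTorch being the paper's implementation environment) of the three steps described above: client-side differential-privacy perturbation, additive secret sharing of the perturbed update across multiple servers, and share-wise aggregation. This is an illustrative sketch only, not the authors' implementation: the ring Z_{2^64}, the fixed-point scaling factor, the Gaussian mechanism with L2 clipping, plain averaging as the aggregation function, and all function names (encode, add_dp_noise, share, aggregate) are assumptions made for this example.

```python
# Illustrative sketch only, not the authors' code. Assumptions: additive secret
# sharing over Z_{2**64} (uint64 arithmetic wraps automatically), fixed-point
# encoding of real-valued parameters, Gaussian noise with L2 clipping for the
# differential-privacy step, and plain averaging as the aggregation function.
import secrets

import numpy as np

SCALE = 2 ** 16  # fixed-point scaling factor for encoding float weights as ring elements


def encode(x):
    """Encode a float vector as fixed-point elements of Z_{2**64}."""
    return np.round(x * SCALE).astype(np.int64).astype(np.uint64)


def decode(x):
    """Decode ring elements back to floats, reading them as signed fixed-point values."""
    return x.astype(np.int64).astype(np.float64) / SCALE


def add_dp_noise(update, clip_norm=1.0, sigma=0.5, rng=None):
    """Client-side DP step: clip the local update in L2 norm, then add Gaussian noise."""
    if rng is None:
        rng = np.random.default_rng()
    clipped = update * min(1.0, clip_norm / (np.linalg.norm(update) + 1e-12))
    return clipped + rng.normal(0.0, sigma * clip_norm, size=update.shape)


def share(update, n_servers):
    """Split an encoded update into n_servers additive shares that sum to it mod 2**64."""
    encoded = encode(update)
    shares = [np.array([secrets.randbelow(2 ** 64) for _ in encoded], dtype=np.uint64)
              for _ in range(n_servers - 1)]
    last = encoded - sum(shares, np.zeros_like(encoded))  # uint64 wraps mod 2**64
    return shares + [last]


def aggregate(all_client_shares, n_clients):
    """Each server sums its shares locally; only the (noisy) aggregate is reconstructed."""
    n_servers = len(all_client_shares[0])
    server_sums = [sum(client[s] for client in all_client_shares) for s in range(n_servers)]
    return decode(sum(server_sums)) / n_clients  # reconstruction of the noisy average


# Toy run: 3 clients, 2 servers, a 4-dimensional "model update" per client.
rng = np.random.default_rng(0)
updates = [rng.normal(size=4) for _ in range(3)]
noisy = [add_dp_noise(u, rng=rng) for u in updates]
client_shares = [share(u, n_servers=2) for u in noisy]
print("secure aggregate :", aggregate(client_shares, n_clients=3))
print("plaintext average:", np.mean(noisy, axis=0))  # agrees up to fixed-point rounding
```

In this sketch no single server ever sees a client's update in the clear; reconstructing the sum of shares reveals only the noisy aggregate, and the noise added before sharing limits what an adversary can infer from the released global model. The paper's actual scheme additionally handles client dropout and richer aggregation functions, which this toy example omits.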

Key words: Federated learning, Secure multi-party computation, Differential privacy, Privacy preserving, Deep learning

CLC Number: TP309
[1]MCMAHAN B,MOORE E,RAMAGE D,et al.Communication-efficient learning of deep networks from decentralized data[C]//Artificial Intelligence and Statistics.PMLR,2017:1273-1282.
[2]ZHU L,LIU Z,HAN S.Deep leakage from gradients[C]//Advances in Neural Information Processing Systems.2019:14747-14756.
[3]ZHAO B,MOPURI K R,BILEN H.iDLG:Improved deep leakage from gradients[J].arXiv:2001.02610,2020.
[4]GEIPING J,BAUERMEISTER H,DRÖGE H,et al.Inverting gradients-How easy is it to break privacy in federated learning?[J].arXiv:2003.14053,2020.
[5]SHOKRI R,STRONATI M,SONG C,et al.Membership inference attacks against machine learning models[C]//2017 IEEE Symposium on Security and Privacy.IEEE,2017:3-18.
[6]NASR M,SHOKRI R,HOUMANSADR A.Comprehensive privacy analysis of deep learning:Passive and active white-box inference attacks against centralized and federated learning[C]//2019 IEEE Symposium on Security and Privacy(SP).IEEE,2019:739-753.
[7]YAO A C.Protocols for secure computations[C]//23rd Annual Symposium on Foundations of Computer Science.IEEE,1982:160-164.
[8]SHAMIR A.How to share a secret[J].Communications of the ACM,1979,22(11):612-613.
[9]CRAMER R,DAMGÅRD I,MAURER U.General secure multi-party computation from any linear secret-sharing scheme[C]//International Conference on the Theory and Applications of Cryptographic Techniques.Berlin:Springer,2000:316-334.
[10]DAMGÅRD I,FITZI M,KILTZ E,et al.Unconditionally secure constant-rounds multi-party computation for equality,comparison,bits and exponentiation[C]//Theory of Cryptography Conference.Berlin:Springer,2006:285-304.
[11]BENDLIN R,DAMGÅRD I,ORLANDI C,et al.Semi-homomorphic encryption and multiparty computation[C]//Annual International Conference on the Theory and Applications of Cryptographic Techniques.Berlin:Springer,2011:169-188.
[12]DAMGÅRD I,PASTRO V,SMART N,et al.Multiparty computation from somewhat homomorphic encryption[C]//Annual Cryptology Conference.Berlin:Springer,2012:643-662.
[13]DWORK C,ROTH A.The algorithmic foundations of differential privacy[J].Foundations and Trends in Theoretical Computer Science,2014,9(3/4):211-407.
[14]DWORK C,MCSHERRY F,NISSIM K,et al.Calibrating noise to sensitivity in private data analysis[C]//Theory of Cryptography Conference.Berlin:Springer,2006:265-284.
[15]MCSHERRY F,TALWAR K.Mechanism design via differential privacy[C]//48th Annual IEEE Symposium on Foundations of Computer Science.IEEE,2007:94-103.
[16]BONAWITZ K,IVANOV V,KREUTER B,et al.Practical secure aggregation for privacy-preserving machine learning[C]//Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security.2017:1175-1191.
[17]GEYER R C,KLEIN T,NABI M.Differentially private federated learning:A client level perspective[J].arXiv:1712.07557,2017.
[18]AGARWAL N,SURESH A T,YU F,et al.cpSGD:Communication-efficient and differentially-private distributed SGD[J].arXiv:1805.10559,2018.
[19]TRUEX S,BARACALDO N,ANWAR A,et al.A hybrid approach to privacy-preserving federated learning[C]//Proceedings of the 12th ACM Workshop on Artificial Intelligence and Security.2019:1-11.
[20]XU R,BARACALDO N,ZHOU Y,et al.HybridAlpha:An efficient approach for privacy-preserving federated learning[C]//Proceedings of the 12th ACM Workshop on Artificial Intelligence and Security.2019:13-23.
[21]ABADI M,CHU A,GOODFELLOW I,et al.Deep learning with differential privacy[C]//Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security.2016:308-318.
[22]CATRINA O,SAXENA A.Secure computation with fixed-point numbers[C]//International Conference on Financial Cryptography and Data Security.Berlin:Springer,2010:35-50.
[23]KRIPS T,WILLEMSON J.Hybrid model of fixed and floating point numbers in secure multiparty computations[C]//International Conference on Information Security.Cham:Springer,2014:179-197.
[24]SO J,GÜLER B,AVESTIMEHR A S.Byzantine-resilient secure federated learning[J].arXiv:2007.11115,2020.
[25]ZHAO Y,LI M,LAI L,et al.Federated learning with non-iid data[J].arXiv:1806.00582,2018.
[26]LI T,SAHU A K,ZAHEER M,et al.Federated optimization in heterogeneous networks[J].arXiv:1812.06127,2018.