Computer Science, 2022, Vol. 49, Issue (12): 22-32. doi: 10.11896/jsjkx.220500240

• Federated Learning

Study on Privacy-preserving Nonlinear Federated Support Vector Machines

YANG Hong-jian, HU Xue-xian, LI Ke-jia, XU Yang, WEI Jiang-hong

  1. School of Data and Target Engineering, PLA Strategic Support Force Information Engineering University, Zhengzhou 450001, China
  • Received: 2022-05-26 Revised: 2022-07-07 Published: 2022-12-14
  • Corresponding author: HU Xue-xian (xuexian_hu@hotmail.com)
  • About author (henuyanghongjian@163.com): YANG Hong-jian, born in 1998, postgraduate. His main research interests include federated learning, homomorphic encryption and blockchain. HU Xue-xian, born in 1982, Ph.D, associate professor, master supervisor. His main research interests include big data security, applied cryptography and network security.
  • Supported by:
    National Natural Science Foundation of China (62172433, 62172434, 61862011, 61872449).

Abstract: Federated learning offers a new approach to multi-party joint modeling when data are isolated in "data silos". Federated support vector machines (SVMs) enable cross-device SVM modeling while the data never leave the local devices. However, existing work provides insufficient privacy protection during training and has paid little attention to nonlinear federated SVMs. To address these problems, this paper combines the random Fourier feature method with the CKKS homomorphic encryption scheme and proposes a privacy-preserving nonlinear federated support vector machine training algorithm (PPNLFedSVM). First, based on the random Fourier feature method, the same approximate mapping function for the Gaussian kernel is generated locally at each participant, and each participant's training data are explicitly mapped from the low-dimensional space to a high-dimensional space. Second, a secure aggregation algorithm for model parameters based on the CKKS cryptosystem protects the privacy of each participant's model parameters and their contributions during model aggregation; the aggregation process is further optimized and adjusted according to the characteristics of CKKS to improve the efficiency of secure aggregation. Theoretical security analysis and experimental results show that PPNLFedSVM preserves the privacy of participants' model parameters and their contributions during training without loss of model accuracy.

Key words: Federated learning, Privacy preserving, Homomorphic encryption, Support vector machines, Multi-party secure random seed negotiation, Random Fourier features
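The first step of the abstract, in which every participant builds the same approximate Gaussian-kernel mapping locally, can be illustrated with the standard random Fourier feature construction. The following is a minimal sketch under stated assumptions, not the paper's implementation: the names `make_rff_map` and `SHARED_SEED`, the bandwidth `sigma`, and the feature dimension are all hypothetical, and the seed is simply hard-coded here, whereas the paper's keywords indicate that it is negotiated securely among the parties. The CKKS-based aggregation step is not shown.

```python
import math
import random


def make_rff_map(seed, in_dim, out_dim, sigma):
    """Return z(x) with z(x).z(y) ~ exp(-||x - y||^2 / (2 * sigma^2))."""
    rng = random.Random(seed)  # same seed -> identical map at every participant
    # Frequencies drawn from the Gaussian kernel's spectral density N(0, 1/sigma^2).
    omega = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(in_dim)]
             for _ in range(out_dim)]
    # Phases drawn uniformly from [0, 2*pi].
    phase = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(out_dim)]
    scale = math.sqrt(2.0 / out_dim)

    def z(x):
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(omega, phase)]

    return z


def gaussian_kernel(x, y, sigma):
    """Exact Gaussian kernel value, used only as a reference."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Two hypothetical participants derive the map from the same agreed seed.
SHARED_SEED = 2022  # assumption: stands in for the negotiated seed
z_a = make_rff_map(SHARED_SEED, in_dim=3, out_dim=4000, sigma=1.0)
z_b = make_rff_map(SHARED_SEED, in_dim=3, out_dim=4000, sigma=1.0)

x = [0.2, -0.1, 0.4]
y = [0.0, 0.3, 0.1]
approx = dot(z_a(x), z_b(y))              # inner product in the mapped space
exact = gaussian_kernel(x, y, sigma=1.0)  # kernel value it approximates
```

Because both participants seed the generator identically, their mappings coincide, so a linear SVM trained on the mapped features at each site lives in the same feature space and its parameters can be aggregated coordinate by coordinate.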

CLC Number: 

  • TP309.2