Computer Science, 2024, Vol. 51, Issue (6): 391-398. DOI: 10.11896/jsjkx.230400182

• Information Security •

Particle Swarm Optimization-based Federated Learning Method for Heterogeneous Data

XU Yicheng, DAI Chaofan, MA Wubin, WU Yahui, ZHOU Haohao, LU Chenyang   

  1. National Key Laboratory of Information Systems Engineering, National University of Defense Technology, Changsha 410073, China
  • Received: 2023-04-22  Revised: 2023-09-11  Online: 2024-06-15  Published: 2024-06-05
  • Corresponding author: MA Wubin (wb_ma@nudt.edu.cn)
  • About author: XU Yicheng (xuyicheng18@nudt.edu.cn), born in 1998, postgraduate. His main research interests include federated learning and optimization algorithms.
    MA Wubin, born in 1986, Ph.D, associate researcher. His main research interests include multi-objective optimization, micro-services and data mining.
  • Supported by:
    General Program of the National Natural Science Foundation of China (61871388).


Abstract: Federated learning is an emerging privacy-preserving distributed machine learning framework whose core feature is the ability to perform distributed machine learning without access to clients' raw data. Each client trains a model on its local data and then uploads the model parameters to the server for aggregation, ensuring that client data remain protected at all times. In this process, two problems severely limit the application of federated learning: the high communication cost caused by frequent parameter transfers, and the non-independent and identically distributed (non-IID) heterogeneous data held by each client. To address these problems, FedPSG, a particle swarm optimization-based federated learning method for heterogeneous data, is proposed. It reduces communication cost by changing the data that clients transfer to the server from model parameters to model scores, so that only a small number of clients need to upload model parameters to the server in each training round. Meanwhile, a model retraining strategy is proposed that uses server-side data to train the global model for a second iteration, further improving model performance by mitigating the impact of data heterogeneity on federated learning. Experiments simulating different data-heterogeneous environments are conducted on the MNIST, FashionMNIST and CIFAR-10 datasets. The results show that FedPSG effectively improves model accuracy under different data-heterogeneous environments and verify that the model retraining strategy effectively alleviates the client-side data heterogeneity problem.
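
To make the round structure described above concrete, the following is a minimal Python sketch, not the authors' implementation: it assumes a linear model trained by gradient descent, treats the "model score" as the client's local loss, lets only the best-scoring client upload full parameters, moves the global model toward those parameters with the classic particle swarm velocity update, and then applies the retraining strategy on a small server-side dataset. All names, data and hyperparameters are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
DIM = 10            # number of model parameters
NUM_CLIENTS = 5

def local_train(weights, X, y, lr=0.05, epochs=20):
    # A client's local training: plain gradient descent on mean squared error.
    w = weights.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def model_score(weights, X, y):
    # The scalar "model score" uploaded instead of full parameters (here: local MSE).
    return float(np.mean((X @ weights - y) ** 2))

# Synthetic, mildly non-IID client data: each client sees a shifted feature distribution.
true_w = rng.normal(size=DIM)
clients = []
for k in range(NUM_CLIENTS):
    X = rng.normal(loc=0.2 * k, size=(100, DIM))
    y = X @ true_w + rng.normal(scale=0.1, size=100)
    clients.append((X, y))

# Small server-side dataset used by the retraining strategy.
X_srv = rng.normal(size=(50, DIM))
y_srv = X_srv @ true_w + rng.normal(scale=0.1, size=50)

global_w = np.zeros(DIM)          # server-side global model (the "particle position")
velocity = np.zeros(DIM)
inertia, c1, c2 = 0.9, 1.0, 1.0
personal_best = [None] * NUM_CLIENTS

for rnd in range(10):
    local_ws, scores = [], []
    for k, (X, y) in enumerate(clients):
        w_k = local_train(global_w, X, y)
        s_k = model_score(w_k, X, y)
        local_ws.append(w_k)
        scores.append(s_k)                      # only this scalar goes to the server
        if personal_best[k] is None or s_k < model_score(personal_best[k], X, y):
            personal_best[k] = w_k
    best = int(np.argmin(scores))               # server requests full weights from this client only
    gbest = local_ws[best]

    # PSO-style aggregation: move the global model toward the best client's weights.
    r1, r2 = rng.random(DIM), rng.random(DIM)
    velocity = (inertia * velocity
                + c1 * r1 * (personal_best[best] - global_w)
                + c2 * r2 * (gbest - global_w))
    global_w = global_w + velocity

    # Retraining strategy: a second pass of training on the server-side data.
    global_w = local_train(global_w, X_srv, y_srv, epochs=5)

print("final server-side MSE:", model_score(global_w, X_srv, y_srv))

In this sketch the uplink from all but one client per round is a single scalar rather than a full parameter vector, which is where the communication saving described in the abstract comes from; the server-side retraining step is what counteracts the drift caused by the non-IID client data.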

Key words: Federated learning, Particle swarm algorithm, Communication cost, Data heterogeneity, Privacy protection

CLC number: TP301