Computer Science, 2022, Vol. 49, Issue (9): 183-193. doi: 10.11896/jsjkx.220500263

• Artificial Intelligence •

Federated Learning Based on Stratified Sampling Optimization for Heterogeneous Clients

LU Chen-yang, DENG Su, MA Wu-bin, WU Ya-hui, ZHOU Hao-hao   

  1. Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha 410073, China
  • Received: 2022-05-30  Revised: 2022-07-04  Online: 2022-09-15  Published: 2022-09-09
  • About author: LU Chen-yang, born in 1997, postgraduate. His main research interests include federated learning and machine learning.
    MA Wu-bin, born in 1986, Ph.D., associate research fellow. His main research interests include data engineering and cyber-physical systems.
  • Supported by:
    National Natural Science Foundation of China (61871388).

Abstract: Federated learning (FL) is a new distributed learning framework for privacy protection. It differs from traditional distributed machine learning in three respects: 1) differences in communication, computing, and storage performance among devices (device heterogeneity), 2) differences in data distribution and data volume (data heterogeneity), and 3) high communication consumption. Under heterogeneous conditions, the data distributions of clients vary greatly, which slows down model convergence. In highly heterogeneous settings in particular, the traditional FL algorithm fails to converge, and its training loss curve fluctuates strongly as the number of local iterations increases. In this work, a FL algorithm based on stratified sampling optimization (FedSSO) is proposed. In FedSSO, a density-based clustering method divides the overall client population into different clusters, and available clients are then drawn proportionally from each cluster to participate in training. Diverse data therefore take part in every training round, which accelerates convergence to the optimal solution. A learning-rate decay strategy and an appropriate choice of the number of local iterations are adopted to guarantee convergence. The convergence of FedSSO is proved both theoretically and experimentally, and its superiority is demonstrated by comparison with other FL algorithms on the public MNIST, Cifar-10, and Sentiment140 datasets.
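The client-selection step described in the abstract can be illustrated with a short sketch. The Python code below is an assumption-laden illustration rather than the authors' implementation: it summarizes each client by a label-distribution vector (one plausible clustering feature), uses scikit-learn's DBSCAN as the density-based clusterer, and the helper names cluster_clients and stratified_sample are hypothetical.

```python
# Illustrative sketch of FedSSO-style stratified client selection (not the authors' code).
# Assumption: each client is summarized by a feature vector (e.g., its label histogram),
# and a density-based clustering method (DBSCAN here) groups similar clients into strata.
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)

def cluster_clients(client_features, eps=0.3, min_samples=2):
    """Partition clients into strata with a density-based clustering method."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(client_features)
    strata = {}
    next_id = labels.max() + 1                      # first id available for noise points
    for i, lab in enumerate(labels):
        key = int(lab) if lab != -1 else int(next_id) + i   # noise -> singleton stratum
        strata.setdefault(key, []).append(i)
    return list(strata.values())

def stratified_sample(strata, num_selected):
    """Draw clients from every stratum in proportion to its size (at least one each)."""
    total = sum(len(s) for s in strata)
    selected = []
    for stratum in strata:
        k = max(1, round(num_selected * len(stratum) / total))
        k = min(k, len(stratum))
        selected.extend(rng.choice(stratum, size=k, replace=False).tolist())
    return selected

# Toy example: 20 clients, each summarized by a 10-class label histogram.
features = rng.dirichlet(np.ones(10) * 0.3, size=20)
strata = cluster_clients(features)
print("client strata:", strata)
print("clients selected this round:", stratified_sample(strata, num_selected=8))
```

A full FedSSO-style round would then run the chosen number of local SGD iterations on each selected client with a decaying learning rate (for instance eta_t = eta_0 / (1 + t), an assumed schedule) and aggregate the resulting models FedAvg-style, weighted by local sample counts.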

Key words: Federated learning, Privacy protection, Clustering, Stratified sampling, Distributed optimization, Convergence analysis

CLC Number: TP301