Computer Science ›› 2023, Vol. 50 ›› Issue (11A): 221000243-5. doi: 10.11896/jsjkx.221000243

• Big Data & Data Science •

Inter-cluster Optimization for Cluster Federated Learning

LI Renjie, YAN Qiao

  1. School of Computer and Software, Shenzhen University, Shenzhen, Guangdong 518000, China
  • Published: 2023-11-09
  • Corresponding author: YAN Qiao (yanq@szu.edu.cn)
  • About author: (1441347985@qq.com) LI Renjie, born in 1997, postgraduate. His main research interests include federated learning and machine learning.
    YAN Qiao, born in 1972, Ph.D., professor, Ph.D. supervisor, is a member of China Computer Federation. Her main research interests include network security, cloud computing, software-defined networking, and artificial intelligence.
  • Supported by:
    National Natural Science Foundation of China (61976142) and Shenzhen Science and Technology Plan Project (JCYJ20210324093609025).

Abstract: Clustered federated learning is commonly used to counter the accuracy loss that data heterogeneity causes in federated learning. A clustering algorithm groups clients with similar data distributions into the same cluster, and each cluster model serves one specific distribution. To obtain good experimental results, current work usually assigns training and test sets drawn from the same distribution to the same cluster. In practice this ideal setting rarely holds: the data on which a local client uses the model may follow a different distribution from the data on which the model was trained. When the distributions differ, the accuracy of the cluster model drops sharply, which degrades the inter-cluster accuracy on local devices. This paper proposes two schemes to improve the inter-cluster accuracy of cluster models in clustered federated learning. The first, adaptive weighted clustering federated learning (AWCFL), incorporates models from other clusters during intra-cluster aggregation, so that the cluster model learns knowledge of other distributions and its inter-cluster accuracy improves. The second, multi-distribution clustering federated learning (MCFL), synchronizes the cluster models to every client and lets each client choose the appropriate model to use. Compared with the first scheme, MCFL does not reduce intra-cluster accuracy, and it improves inter-cluster accuracy markedly. Both schemes are evaluated on the MNIST and EMNIST datasets; compared with IFCA, CFL (Clustered Federated Learning), and FedAvg, inter-cluster accuracy is significantly improved.

Key words: Federated learning, Clustering, Aggregation optimization, Data distribution
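
The two schemes named in the abstract can be illustrated with a minimal sketch. The abstract does not give the aggregation weights used by AWCFL or the selection criterion used by MCFL, so everything specific below is an assumption made for illustration: the fixed mixing weight `alpha`, the lowest-local-loss selection rule, the toy linear model, and the helper names `fedavg`, `mix_with_other_clusters`, and `select_model` are all hypothetical and are not the authors' implementation.

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Sample-size-weighted average of client parameter vectors (FedAvg-style)."""
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack([np.asarray(w, dtype=float) for w in client_weights])
    return (sizes[:, None] * stacked).sum(axis=0) / sizes.sum()

def mix_with_other_clusters(cluster_models, k, alpha=0.8):
    """AWCFL-style step (assumed form): blend cluster k's model with the
    mean of the other clusters' models. `alpha` is a hypothetical fixed
    weight; the adaptive weighting itself is not given in the abstract."""
    own = np.asarray(cluster_models[k], dtype=float)
    others = [np.asarray(m, dtype=float)
              for j, m in enumerate(cluster_models) if j != k]
    if not others:
        return own.copy()
    return alpha * own + (1.0 - alpha) * np.mean(others, axis=0)

def select_model(cluster_models, X, y):
    """MCFL-style step (assumed criterion): a client that holds every
    cluster model picks the one with the lowest mean squared error on its
    local data. A linear model stands in for the real network."""
    losses = [float(np.mean((X @ np.asarray(w, dtype=float) - y) ** 2))
              for w in cluster_models]
    return int(np.argmin(losses)), losses

# Toy usage: two cluster models, one client whose data matches cluster 1.
models = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = X @ models[1]
best, losses = select_model(models, X, y)
mixed = mix_with_other_clusters(models, 0, alpha=0.8)
```

In this toy setup the client selects cluster 1, whose model reproduces its data exactly, while the mixed model for cluster 0 moves part of the way toward cluster 1. That mirrors the trade-off the abstract describes: mixing buys inter-cluster accuracy at some cost to intra-cluster fit, whereas per-client selection leaves each cluster model untouched.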

CLC Number:

  • TP301