Computer Science ›› 2025, Vol. 52 ›› Issue (12): 302-313. doi: 10.11896/jsjkx.241200056
WANG Xin1,2, CHEN Kun1, SUN Lingyun2
Abstract: Owing to its inherent privacy-preserving properties, federated learning has gradually become a widely recognized distributed machine learning framework. However, because participants' data distributions differ, especially when the data are non-independent and identically distributed (Non-IID), federated learning faces severe challenges such as insufficient generalization, degraded convergence, and data skew. Mitigating the Non-IID problem with pre-trained foundation models has emerged as a novel approach and has spawned a wide variety of solutions. This paper surveys existing work from the perspective of pre-trained foundation models. It first introduces foundation model methods and compares typical foundation model encoder architectures. It then proposes a new taxonomy organized along three angles: modifying the model input, retraining part of the foundation model's structure, and parameter-efficient fine-tuning. Finally, it discusses the core open problems in this line of work and directions for future research.
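To make the third angle concrete, below is a minimal sketch, assuming a PyTorch-style setup, of how parameter-efficient fine-tuning can be combined with federated averaging: each client trains only small LoRA adapters on top of a frozen pre-trained layer, and the server aggregates just those adapter weights. All names (`LoRALinear`, `adapter_state`, `run_round`) and hyperparameters are hypothetical illustrations, not the method of any specific surveyed work.

```python
# Illustrative sketch: FedAvg over LoRA adapters only, with the pre-trained
# backbone frozen. Names and hyperparameters here are hypothetical.
import copy
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen pre-trained linear layer plus a trainable low-rank update."""
    def __init__(self, base: nn.Linear, rank: int = 4):
        super().__init__()
        self.base = base
        for p in self.base.parameters():   # freeze pre-trained weights
            p.requires_grad = False
        self.lora_a = nn.Parameter(torch.randn(base.in_features, rank) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(rank, base.out_features))

    def forward(self, x):
        # Pre-trained output plus the low-rank correction x @ A @ B.
        return self.base(x) + x @ self.lora_a @ self.lora_b

def adapter_state(model):
    """Extract only the LoRA parameters; the backbone is never transmitted."""
    return {k: v.detach().clone() for k, v in model.state_dict().items()
            if "lora_" in k}

def run_round(global_model, client_loaders, local_steps=10, lr=1e-3):
    """One federated round: local adapter training, then server-side averaging."""
    updates = []
    for loader in client_loaders:
        local = copy.deepcopy(global_model)
        opt = torch.optim.SGD(
            [p for p in local.parameters() if p.requires_grad], lr=lr)
        loss_fn = nn.CrossEntropyLoss()
        batches = iter(loader)
        for _ in range(local_steps):
            try:
                x, y = next(batches)
            except StopIteration:
                break
            opt.zero_grad()
            loss_fn(local(x), y).backward()
            opt.step()
        updates.append(adapter_state(local))
    # Server: average the adapter weights and load them into the global model.
    avg = {k: torch.stack([u[k] for u in updates]).mean(0) for k in updates[0]}
    global_model.load_state_dict(avg, strict=False)
    return global_model
```

Compared with full-model federated averaging, only the rank-r adapter matrices cross the network each round while the pre-trained backbone stays fixed on every client, which is the communication and memory saving that the parameter-efficient fine-tuning methods surveyed here exploit.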