Computer Science ›› 2025, Vol. 52 ›› Issue (12): 302-313. doi: 10.11896/jsjkx.241200056
WANG Xin1,2, CHEN Kun1, SUN Lingyun2
Abstract: Owing to its inherent privacy-preserving properties, federated learning has gradually become a widely recognized distributed machine learning framework. However, because participants' data distributions differ, especially when the data are non-independent and identically distributed (Non-IID), federated learning faces severe challenges such as insufficient generalization, degraded convergence, and data skew. Mitigating the Non-IID problem with pre-trained foundation models has emerged as a novel approach and has spawned a wide variety of solutions. This paper surveys existing work from the perspective of pre-trained foundation models. It first introduces foundation model methods and compares typical foundation model encoder architectures. It then proposes a new taxonomy organized along three angles: modifying the model input, retraining part of the foundation model's structure, and parameter-efficient fine-tuning. Finally, it discusses the core open problems in this line of work and directions for future research.
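To make the third angle concrete, below is a minimal sketch, assuming a PyTorch-style setup, of how parameter-efficient fine-tuning can be combined with federated averaging: each client trains only small LoRA adapters on top of a frozen pre-trained layer, and the server aggregates just those adapter weights. All names (`LoRALinear`, `adapter_state`, `run_round`) and hyperparameters are hypothetical illustrations, not the method of any specific surveyed work.

```python
# Illustrative sketch: FedAvg over LoRA adapters only, with the pre-trained
# backbone frozen. Names and hyperparameters here are hypothetical.
import copy
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen pre-trained linear layer plus a trainable low-rank update."""
    def __init__(self, base: nn.Linear, rank: int = 4):
        super().__init__()
        self.base = base
        for p in self.base.parameters():   # freeze pre-trained weights
            p.requires_grad = False
        self.lora_a = nn.Parameter(torch.randn(base.in_features, rank) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(rank, base.out_features))

    def forward(self, x):
        # Pre-trained output plus the low-rank correction x @ A @ B.
        return self.base(x) + x @ self.lora_a @ self.lora_b

def adapter_state(model):
    """Extract only the LoRA parameters; the backbone is never transmitted."""
    return {k: v.detach().clone() for k, v in model.state_dict().items()
            if "lora_" in k}

def run_round(global_model, client_loaders, local_steps=10, lr=1e-3):
    """One federated round: local adapter training, then server-side averaging."""
    updates = []
    for loader in client_loaders:
        local = copy.deepcopy(global_model)
        opt = torch.optim.SGD(
            [p for p in local.parameters() if p.requires_grad], lr=lr)
        loss_fn = nn.CrossEntropyLoss()
        batches = iter(loader)
        for _ in range(local_steps):
            try:
                x, y = next(batches)
            except StopIteration:
                break
            opt.zero_grad()
            loss_fn(local(x), y).backward()
            opt.step()
        updates.append(adapter_state(local))
    # Server: average the adapter weights and load them into the global model.
    avg = {k: torch.stack([u[k] for u in updates]).mean(0) for k in updates[0]}
    global_model.load_state_dict(avg, strict=False)
    return global_model
```

Compared with full-model federated averaging, only the rank-r adapter matrices cross the network each round while the pre-trained backbone stays fixed on every client, which is the communication and memory saving that the parameter-efficient fine-tuning methods surveyed here exploit.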