Computer Science ›› 2026, Vol. 53 ›› Issue (3): 197-206. doi:10.11896/jsjkx.250100068
LI Linhao1,2,3, XU Yanan1, DONG Yongfeng1,2,3, WANG Zhen1,2,3
Abstract: In distributed environments, data heterogeneity manifests as differences in data features. Collaboration among expert models suffers from knowledge isolation and unreasonable task allocation, which leads to uneven expert training quality, prevents individual models from fully exploiting their strengths, and limits overall performance. To address these problems, a social learning framework based on multi-expert collaboration and information interaction (MECII) is proposed. Combining the mixture-of-experts model with the idea of social learning, the framework optimizes knowledge sharing and complementarity among experts through four modules: multi-expert collaboration, a gating network, adaptive information interaction, and a gating-selection constraint, effectively addressing data heterogeneity and expert collaboration in distributed learning. Through precise expert selection and task allocation, MECII promotes information flow among experts, improves each expert's accuracy on its specific data, and enhances overall model performance. Experimental results show that MECII achieves significant performance gains over traditional federated learning baselines on the CIFAR-10 and CIFAR-100 datasets. In particular, under data heterogeneity, MECII improves classification accuracy by 6.69 and 5.13 percentage points, respectively, over the state-of-the-art FedL2P method, while also effectively improving each expert's individual accuracy. These results confirm MECII's clear advantages in promoting expert collaboration and improving individual accuracy.
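The abstract describes MECII as built on a mixture-of-experts model in which a gating network selects and weights experts per input. As a rough illustration of that routing mechanism only (not the authors' implementation — `TinyMoE`, the linear experts, and the top-k sparse gating below are illustrative assumptions), a minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class TinyMoE:
    """Illustrative mixture-of-experts: linear experts plus a softmax
    gating network that routes each sample to its top-k experts.
    This is a generic MoE sketch, not the MECII architecture itself."""

    def __init__(self, d_in, d_out, n_experts=4, k=2):
        self.experts = [rng.normal(0, 0.1, (d_in, d_out)) for _ in range(n_experts)]
        self.gate = rng.normal(0, 0.1, (d_in, n_experts))
        self.k = k

    def forward(self, x):
        # Gating scores: one distribution over experts per sample.
        scores = softmax(x @ self.gate)                    # (batch, n_experts)
        # Sparse routing: keep only the top-k experts per sample,
        # then renormalize their weights to sum to 1.
        topk = np.argsort(scores, axis=-1)[:, -self.k:]
        mask = np.zeros_like(scores)
        np.put_along_axis(mask, topk, 1.0, axis=-1)
        w = scores * mask
        w = w / w.sum(axis=-1, keepdims=True)
        # Output: weighted combination of the selected experts' outputs.
        outs = np.stack([x @ E for E in self.experts], axis=1)  # (batch, n_experts, d_out)
        return np.einsum('be,beo->bo', w, outs)

moe = TinyMoE(d_in=8, d_out=3)
y = moe.forward(rng.normal(size=(5, 8)))
print(y.shape)  # (5, 3)
```

In MECII's terms, the gating-selection constraint and adaptive information interaction would act on top of this routing step, shaping which experts are chosen and how knowledge flows between them.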
CLC Number:
[1] AKERS R L,JENNINGS W G.[M]//The Handbook of Criminological Theory.2015:230-240.
[2] XIE G Q,ZHONG B W,LI Y.Distributed Adaptive Multi-agent Rendezvous Control Based on Average Consensus Protocol[J].Computer Science,2024,51(5):242-249.
[3] JACOBS R A,JORDAN M I,NOWLAN S J,et al.Adaptive mixtures of local experts[J].Neural Computation,1991,3(1):79-87.
[4] KONEČNÝ J,MCMAHAN H B,YU F X,et al.Federated Learning:Strategies for Improving Communication Efficiency[J].arXiv:1610.05492,2016.
[5] ZHU G H,QI J H,ZHU Z N,et al.Research on progressive deep ensemble architecture search algorithm[J].Chinese Journal of Computers,2023,46(10):2041-2065.
[6] LIU Q,SHI M L,HUANG Z G,et al.Multi-agent collaborative reinforcement learning method based on dual-perspective modeling[J].Chinese Journal of Computers,2024,47(7):1582-1594.
[7] BANDURA A,WALTERS R H.Social learning theory[M].Englewood Cliffs,NJ:Prentice Hall,1977.
[8] BADGHISH S,SHAIK A S,SAHORE N,et al.Can transactional use of AI-controlled voice assistants for service delivery pickup pace in the near future? A social learning theory(SLT) perspective[J].Technological Forecasting and Social Change,2024,198:122972.
[9] PARKER L E.Distributed Intelligence:Overview of the Field and its Application in Multi-Robot Systems[C]//AAAI Fall Symposium:Regarding the Intelligence in Distributed Intelligent Systems.2007:1-6.
[10] WANG X W,FENG X,YU H Q.Multi-agent Cooperative Algorithm for Obstacle Clearance Based on Deep Deterministic Policy Gradient and Attention Critic[J].Computer Science,2024,51(7):319-326.
[11] WANG X,ZHAO C,HUANG T W,et al.Cooperative learning of multi-agent systems via reinforcement learning[J].IEEE Transactions on Signal and Information Processing over Networks,2023,9:13-23.
[12] AHMED I,SYED M A,MAARUF M,et al.Distributed computing in multi-agent systems:a survey of decentralized machine learning approaches[J].Computing,2025,107(1):2.
[13] HONG S,ZHENG X,CHEN J,et al.MetaGPT:Meta programming for a multi-agent collaborative framework[J].arXiv:2308.00352,2023.
[14] ZHANG M Y,JIN Z,LIU K.Virtual regret advantage self-play method for cooperative-competitive hybrid multi-agent systems[J].Journal of Software,2024,35(2):739-757.
[15] MCMAHAN B,MOORE E,RAMAGE D,et al.Communication-efficient learning of deep networks from decentralized data[C]//Artificial Intelligence and Statistics.PMLR,2017:1273-1282.
[16] KAIROUZ P,MCMAHAN H B,AVENT B,et al.Advances and open problems in federated learning[J].Foundations and Trends in Machine Learning,2021,14(1/2):1-210.
[17] SATTLER F,WIEDEMANN S,MÜLLER K R,et al.Robust and communication-efficient federated learning from non-iid data[J].IEEE Transactions on Neural Networks and Learning Systems,2019,31(9):3400-3413.
[18] DINH C T,TRAN N,NGUYEN J.Personalized federated learning with Moreau envelopes[J].Advances in Neural Information Processing Systems,2020,33:21394-21405.
[19] WANG J,LIU Q,LIANG H,et al.Tackling the objective inconsistency problem in heterogeneous federated optimization[J].Advances in Neural Information Processing Systems,2020,33:7611-7623.
[20] LI T,SAHU A K,ZAHEER M,et al.Federated optimization in heterogeneous networks[J].Proceedings of Machine Learning and Systems,2020,2:429-450.
[21] LI Q,HE B,SONG D.Model-contrastive federated learning[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.2021:10713-10722.
[22] QU Z,LI X,DUAN R,et al.Generalized federated learning via sharpness aware minimization[C]//International Conference on Machine Learning.PMLR,2022:18250-18280.
[23] SHI Y,LIANG J,ZHANG W,et al.Towards understanding and mitigating dimensional collapse in heterogeneous federated learning[J].arXiv:2210.00226,2022.
[24] LEE R,KIM M,LI D,et al.FedL2P:Federated learning to personalize[C]//Advances in Neural Information Processing Systems.2024.
[25] HUANG W,YE M,SHI Z,et al.Federated learning for generalization,robustness,fairness:A survey and benchmark[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2024,46(12):9387-9406.
[26] ZHANG R L,DU J H,YIN H.Client selection algorithm in cross-device federated learning[J].Journal of Software,2024,35(12):5725-5740.
[27] WANG Y,FU H,KANAGAVELU R,et al.An aggregation-free federated learning for tackling data heterogeneity[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.2024:26233-26242.
[28] SHAZEER N,MIRHOSEINI A,MAZIARZ K,et al.Outrageously large neural networks:The sparsely-gated mixture-of-experts layer[J].arXiv:1701.06538,2017.
[29] YANG Z,DAI Z,SALAKHUTDINOV R,et al.Breaking the softmax bottleneck:A high-rank RNN language model[J].arXiv:1711.03953,2017.
[30] FEDUS W,ZOPH B,SHAZEER N.Switch transformers:Scaling to trillion parameter models with simple and efficient sparsity[J].Journal of Machine Learning Research,2022,23(120):1-39.
[31] LI Y,JIANG S,HU B,et al.Uni-MoE:Scaling unified multimodal LLMs with mixture of experts[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2025,47(5):3424-3439.
[32] JORDAN M I,JACOBS R A.Hierarchical mixtures of experts and the EM algorithm[J].Neural Computation,1994,6(2):181-214.
[33] SHAZEER N,STERN M.Adafactor:Adaptive learning rates with sublinear memory cost[C]//International Conference on Machine Learning.PMLR,2018:4596-4604.
[34] YU J,ZHUGE Y,ZHANG L,et al.Boosting continual learning of vision-language models via mixture-of-experts adapters[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.2024:23219-23230.
[35] ROSENBAUM C,KLINGER T,RIEMER M.Routing networks:Adaptive selection of non-linear functions for multi-task learning[J].arXiv:1711.01239,2017.
[36] DONG L,YANG N,WANG W,et al.Unified language model pre-training for natural language understanding and generation[C]//Advances in Neural Information Processing Systems.2019.
[37] DING X,ZHANG X,HAN J,et al.Scaling up your kernels to 31x31:Revisiting large kernel design in CNNs[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.2022:11963-11975.