Computer Science ›› 2023, Vol. 50 ›› Issue (11A): 220800021-8. DOI: 10.11896/jsjkx.220800021
王春东, 杜英琦, 莫秀良, 付浩然
WANG Chundong, DU Yingqi, MO Xiuliang, FU Haoran
Abstract: The emergence of federated learning resolves the "data island" problem of traditional machine learning, enabling a collective model to be trained while the privacy of each client's local data is preserved. When client data are independently and identically distributed (IID), federated learning can reach an accuracy close to that of centralized machine learning. In real-world scenarios, however, differences in client devices, geographic locations, and other factors mean that client data often contain noise and are non-independently and identically distributed (Non-IID). To address this, a CutMix-based federated learning framework, CutMix Enhanced Federated Learning (CEFL), is proposed. Noisy samples are first filtered out by a data-cleaning algorithm, and training is then performed with CutMix-based data augmentation, which effectively improves the accuracy of federated learning models in realistic settings. Experiments on the standard MNIST and CIFAR-10 datasets show that, compared with the traditional federated learning algorithm, CEFL improves model accuracy on Non-IID data by 23% and 19%, respectively.
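For readers unfamiliar with the augmentation step, the sketch below illustrates the CutMix operation that the framework builds on (Yun et al., ICCV 2019): a random patch from a shuffled partner image is pasted into each training image, and the two labels are mixed in proportion to the patch area. This is a minimal NumPy illustration under assumptions of our own; the function name cutmix_batch and its interface are illustrative and are not taken from the paper's CEFL implementation.

```python
# Minimal sketch of CutMix augmentation (Yun et al., ICCV 2019).
# Illustrative only -- not the authors' CEFL code.
import numpy as np

def cutmix_batch(images, labels, alpha=1.0, rng=None):
    """Mix a batch of images (N, C, H, W) by pasting a random patch
    from a shuffled copy of the batch. Returns the mixed images, the
    two label sets, and the mixing ratio lam, to be used in the loss:
        loss = lam * CE(pred, labels) + (1 - lam) * CE(pred, labels_perm)
    """
    rng = rng or np.random.default_rng()
    n, _, h, w = images.shape
    lam = rng.beta(alpha, alpha)        # mixing ratio ~ Beta(alpha, alpha)
    perm = rng.permutation(n)           # partner image for each sample

    # Choose a patch whose area is (1 - lam) of the image area.
    cut_ratio = np.sqrt(1.0 - lam)
    ch, cw = int(h * cut_ratio), int(w * cut_ratio)
    cy, cx = rng.integers(h), rng.integers(w)   # random patch center
    y1, y2 = np.clip(cy - ch // 2, 0, h), np.clip(cy + ch // 2, 0, h)
    x1, x2 = np.clip(cx - cw // 2, 0, w), np.clip(cx + cw // 2, 0, w)

    mixed = images.copy()
    mixed[:, :, y1:y2, x1:x2] = images[perm, :, y1:y2, x1:x2]
    # Correct lam by the actual pasted area (clipping may shrink the box).
    lam = 1.0 - ((y2 - y1) * (x2 - x1) / (h * w))
    return mixed, labels, labels[perm], lam
```

In a CEFL-style pipeline, such an operation would presumably be applied to each client's cleaned local batches before the usual local-update-and-aggregate rounds of federated training; the exact integration with the data-cleaning algorithm is described in the full paper and is not reconstructed here.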