Computer Science ›› 2023, Vol. 50 ›› Issue (2): 23-31. doi: 10.11896/jsjkx.221100133

• Edge Intelligent Collaboration Technology and Frontier Applications •

Hierarchical Memory Pool Based Edge Semi-supervised Continual Learning Method

WANG Xiangwei, HAN Rui, Chi Harold LIU   

  1. School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
  • Received: 2022-11-15  Revised: 2023-01-16  Online: 2023-02-15  Published: 2023-02-22
  • Supported by:
    National Natural Science Foundation of China (62132019, 62272046, 61872337)

Abstract: Continuous changes in the external environment cause performance regression in neural networks trained with traditional deep learning methods, so the continual learning (CL) area has gradually attracted the attention of more researchers. For edge intelligence, a CL model must not only overcome catastrophic forgetting but also face the severe challenge of limited resources, which manifests mainly as a shortage of labeled data and of powerful devices. However, existing classic CL methods usually rely on a large number of labeled samples to maintain plasticity and stability, and a lack of labeled resources leads to a significant accuracy drop. Meanwhile, to cope with insufficient annotation resources, semi-supervised learning methods often pay a large computational and memory overhead for higher accuracy. In response to these problems, a low-cost semi-supervised CL method named edge hierarchical memory learner (EdgeHML) is proposed. EdgeHML can effectively utilize a large number of unlabeled samples together with a small number of labeled ones. It is built on a hierarchical memory pool that leverages a multi-level storage structure to store and replay samples, and it implements the interaction between levels through a combination of online and offline strategies. In addition, to further reduce the computational overhead of unlabeled samples, EdgeHML adopts a progressive learning method that reduces the computation cycles spent on unlabeled samples by controlling the learning process. Experimental results show that, on three semi-supervised CL tasks, EdgeHML improves model accuracy by up to 16.35% compared with a classic CL method, and training iteration time can be reduced by more than 50% compared with semi-supervised methods. EdgeHML thus achieves a semi-supervised CL process with high performance and low overhead for edge intelligence.
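The abstract describes two mechanisms: a multi-level replay memory and a progressive schedule for unlabeled samples. The Python sketch below illustrates only the general idea of a hierarchical memory pool with reservoir-sampling replay, reconstructed from the abstract alone; all names (ReservoirPool, HierarchicalMemoryPool, the capacity defaults, the demotion rule) are hypothetical assumptions, not the paper's actual implementation.

```python
import random

class ReservoirPool:
    """Fixed-capacity pool maintained by reservoir sampling over a stream."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []
        self.seen = 0  # number of samples offered to this pool so far

    def add(self, sample):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(sample)
            return None
        j = random.randrange(self.seen)  # classic reservoir sampling (Algorithm R)
        if j < self.capacity:
            evicted, self.items[j] = self.items[j], sample
            return evicted   # caller may demote the evicted sample
        return sample        # the incoming sample itself was not retained

    def sample(self, n):
        return random.sample(self.items, min(n, len(self.items)))

class HierarchicalMemoryPool:
    """Two-level memory: a small fast online buffer backed by a larger offline pool."""
    def __init__(self, online_cap=256, offline_cap=4096):
        self.online = ReservoirPool(online_cap)    # fast level: labeled samples
        self.offline = ReservoirPool(offline_cap)  # slow level: unlabeled or demoted samples

    def add(self, sample, labeled):
        if labeled:
            dropped = self.online.add(sample)
            if dropped is not None:
                self.offline.add(dropped)  # demote to the lower level instead of discarding
        else:
            self.offline.add(sample)

    def replay_batch(self, n_labeled, n_unlabeled):
        # A small labeled batch plus a larger unlabeled batch for semi-supervised replay.
        return self.online.sample(n_labeled), self.offline.sample(n_unlabeled)
```

Under these assumptions, a training loop would interleave fresh stream batches with replay_batch() output, applying a supervised loss to the labeled half and a consistency or pseudo-labeling loss (in the style of FixMatch) to the unlabeled half; the paper's online/offline interaction and progressive schedule would further control how often the offline level is accessed and how many computation cycles unlabeled samples receive.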

Key words: Edge intelligence, Continual learning, Semi-supervised learning, Data labeling, Deep neural network

CLC Number: TP301