Computer Science ›› 2026, Vol. 53 ›› Issue (5): 388-403. doi: 10.11896/jsjkx.250300131

• Computer Architecture •

Novel Multi-task Federated Learning Based Approach for Detecting and Diagnosing Anomalies in Cloud Microservices

CHEN Peng1, HAO Junfeng1, XIA Yunni2, LI Xi1   

  1 School of Computer and Software Engineering, Xihua University, Chengdu 610039, China
  2 College of Computer, Chongqing University, Chongqing 400044, China
  • Received: 2025-03-24  Revised: 2025-06-24  Published: 2026-05-08
  • About author: CHEN Peng, born in 1979, Ph.D., professor, is an executive member of CCF (No. B3144M). His main research interests include cloud computing, service computing and anomaly detection.
    XIA Yunni, born in 1980, Ph.D., professor, doctoral supervisor, is a member of CCF (No. 23641M). His main research interests include cloud computing, service computing and edge computing.
  • Supported by:
    National Natural Science Foundation of China (62172062), Sichuan Provincial Natural Science Foundation (2024NSFTD0008) and Science and Technology Program of Sichuan Province (2020JDRC0067, 2023JDRC0087).

Abstract: Microservice architecture is widely used to develop applications in cloud environments. Its essence is to build an application from a set of small, functionally independent, autonomous services that offer high cohesion, high availability, low coupling and good scalability. However, microservice architecture is a highly dynamic distributed computing architecture, so real-time system anomaly detection across distributed, independent microservices is very challenging, and determining the category of each detected anomaly is even more critical in practice. To address these problems, a multi-task federated learning-based system anomaly detection and diagnosis (MT-FL-SADD) method is proposed. Firstly, a multi-task federated learning (MT-FL) distributed learning framework is proposed to construct an anomaly detection and diagnosis model for each microservice. Secondly, to identify the complex system anomaly patterns and features that arise at microservice runtime, a feature extractor based on squeeze-and-excitation and external attention (SE-EA-EDN) is constructed to efficiently extract features from real-time runtime monitoring data. Finally, a local-global feature-based parallel knowledge transfer (LGF-PKT) framework is designed to parallelize the weight updates of local and global features. Experiments validate the effectiveness of the proposed method: compared with other federated learning methods, MT-FL-SADD improves the average Macro F1 by 33.9% and the average Micro F1 by 33.4% on the microservice benchmarking platforms Sock Shop and Train Ticket, and improves the average F1 by 2.2% on SWaT, SMD and SKAB.
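
The abstract reports both Macro F1 and Micro F1 for anomaly diagnosis. As a minimal sketch of why the two are reported separately, the following computes both from their standard definitions for single-label multi-class predictions. The function name `f1_scores` and the labels are hypothetical illustrations, not code or data from the paper.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (macro_f1, micro_f1) for single-label multi-class predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # predicted class p wrongly
            fn[t] += 1          # missed the true class t

    def f1(tp_, fp_, fn_):
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Macro F1: unweighted mean of per-class F1, so rare classes count equally.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    # Micro F1: F1 over counts pooled across all classes.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro


# Hypothetical, imbalanced diagnosis labels (assumed for illustration only).
y_true = ["normal"] * 8 + ["cpu", "mem"]
y_pred = ["normal"] * 8 + ["normal", "mem"]
macro, micro = f1_scores(y_true, y_pred)
print(f"macro F1 = {macro:.3f}, micro F1 = {micro:.3f}")
```

In this toy case the single missed `cpu` anomaly barely affects Micro F1 but pulls Macro F1 down sharply, which is why Macro F1 is the more sensitive indicator when anomaly categories are rare.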

Key words: Microservice architecture, Multi-task federated learning, Distributed computing architectures, System anomaly detection and diagnosis

CLC Number: 

  • TP181