Computer Science, 2024, Vol. 51, Issue (8): 63-74. doi: 10.11896/jsjkx.230600103

• Database & Big Data & Data Science •

Out-of-Distribution Hard Disk Failure Prediction with Affinity Propagation Clustering and Broad Learning Systems

WANG Yiyang¹, LIU Fagui¹,², PENG Lingxia¹, ZHONG Guoxiang¹

  1. School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, China
  2. Peng Cheng Laboratory, Shenzhen, Guangdong 518066, China
  • Received: 2023-06-12 Revised: 2023-11-28 Online: 2024-08-15 Published: 2024-08-13
  • About author: WANG Yiyang, born in 1999, postgraduate. His main research interest is cloud computing.
    ZHONG Guoxiang, born in 1994, Ph.D. His main research interests include cloud computing, AIOps and machine learning.
  • Supported by:
    Major Key Project of PCL, China (PCL2023A09), Guangdong Major Project of Basic and Applied Basic Research (2019B030302002), Science and Technology Major Project of Guangzhou (202007030006) and Science and Technology Project of Guangdong Province (2021B1111600001).

Abstract: Hard disks are the primary storage devices in cloud data centers, and hard disk failure prediction is crucial for ensuring data security. However, failure SMART samples are far scarcer than healthy ones, and this imbalance can bias the model. Moreover, different hard disk models exhibit different data distributions, so a prediction model trained on data from specific hard disks may not be suitable for other hard disks. To address these issues, this paper proposes a method for out-of-distribution hard disk failure prediction that combines the affinity propagation (AP) clustering algorithm with the broad learning system. To tackle the sample imbalance problem, the paper uses the AP clustering algorithm to cluster samples close to failure and treats all samples in the cluster containing confirmed failure instances as additional failure samples. To address the distribution differences between hard disk models, the paper combines the manifold regularization framework with the broad learning system to learn the low-dimensional structure of hard disk data, thereby improving the model's generalization ability to unknown data. Experimental results show that, on the dataset resampled by the AP clustering algorithm, the F1_Score of multiple methods increases by an average of 0.2 compared with the datasets resampled by comparative methods. Additionally, in the task of predicting out-of-distribution hard disk failures, the F1_Score of the proposed model is 0.1 to 0.2 higher than that of other methods.
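The resampling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it is a plain-Python version of the message-passing scheme of affinity propagation, applied to made-up two-feature samples from a single failing disk. The function names `affinity_propagation` and `augment_failures`, the toy feature values, the damping factor of 0.7, and the use of the median similarity as the preference are all assumptions made for illustration.

```python
from statistics import median


def affinity_propagation(points, damping=0.7, iterations=200):
    """Return, for every point, the index of the exemplar of its cluster."""
    n = len(points)
    # Similarity: negative squared Euclidean distance between samples.
    s = [[-sum((u - v) ** 2 for u, v in zip(p, q)) for q in points] for p in points]
    # Preference (self-similarity): median off-diagonal similarity (assumed choice).
    pref = median(s[i][k] for i in range(n) for k in range(n) if i != k)
    for i in range(n):
        s[i][i] = pref
    r = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    a = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(iterations):
        # r(i, k) = s(i, k) - max over k' != k of (a(i, k') + s(i, k'))
        for i in range(n):
            scores = [a[i][k] + s[i][k] for k in range(n)]
            best = max(range(n), key=scores.__getitem__)
            second = max(scores[k] for k in range(n) if k != best)
            for k in range(n):
                target = s[i][k] - (second if k == best else scores[best])
                r[i][k] = damping * r[i][k] + (1 - damping) * target
        # a(i, k) = min(0, r(k, k) + sum of positive r(i', k) for i' not in {i, k})
        # a(k, k) = sum of positive r(i', k) for i' != k
        for k in range(n):
            pos = [max(r[i][k], 0.0) for i in range(n)]
            total = sum(pos) - pos[k]
            for i in range(n):
                if i == k:
                    target = total
                else:
                    target = min(0.0, r[k][k] + total - pos[i])
                a[i][k] = damping * a[i][k] + (1 - damping) * target
    exemplars = [k for k in range(n) if a[k][k] + r[k][k] > 0]
    if not exemplars:
        # No exemplar emerged: fall back to one cluster per point.
        return list(range(n))
    return [i if i in exemplars else max(exemplars, key=lambda k: s[i][k])
            for i in range(n)]


def augment_failures(points, labels):
    """Relabel every sample that shares a cluster with a confirmed failure."""
    clusters = affinity_propagation(points)
    failure_clusters = {c for c, y in zip(clusters, labels) if y == 1}
    return [1 if c in failure_clusters else y for c, y in zip(clusters, labels)]


# Hypothetical samples from one failing disk, ordered by time.
# The first six look healthy, the last six look degraded.
points = [
    (0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2), (0.15, 0.05), (0.05, 0.25),
    (5.0, 5.2), (5.3, 4.9), (4.8, 5.1), (5.1, 5.4), (5.25, 5.05), (4.9, 4.85),
]
labels = [0] * 11 + [1]  # only the final sample is a confirmed failure

augmented = augment_failures(points, labels)
print(augmented)
```

In this sketch only samples that land in the same cluster as the confirmed failure are promoted to failure samples, so early samples that still resemble a healthy disk keep their original label. On real SMART data the features would need scaling first, and the preference value would control how many clusters emerge.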

Key words: Hard disk failure prediction, Class imbalance, Out-of-distribution generalization, Affinity propagation clustering, Broad learning system, Manifold learning

CLC Number: TP302