Computer Science, 2024, Vol. 51, Issue (8): 63-74. doi: 10.11896/jsjkx.230600103
• Database & Big Data & Data Science •
WANG Yiyang1, LIU Fagui1,2, PENG Lingxia1, ZHONG Guoxiang1
[1]GHEMAWAT S,GOBIOFF H,LEUNG S T.The Google file system[C]//Proceedings of the Nineteenth ACM Symposium on Operating Systems Principles.New York:Association for Computing Machinery,2003:29-43.
[2]ZHANG H,TANG D,CAI H L.Study on Predictive Erasure Codes in Distributed Storage System[J].Computer Science,2021,48(5):130-139.
[3]MURRAY J F,HUGHES G F,KREUTZ-DELGADO K,et al.Machine Learning Methods for Predicting Failures in Hard Drives:A Multiple-Instance Application[J].Journal of Machine Learning Research,2005,6(27):783-816.
[4]TOMER V,SHARMA V,GUPTA S,et al.Hard disk drive failure prediction using SMART attribute[J].Materials Today:Proceedings,2021,46(20):11258-11262.
[5]GAO X,ZHA S,LI X,et al.Incremental Prediction Model of Disk Failures Based on the Density Metric of Edge Samples[J].IEEE Access,2019,7:114285-114296.
[6]BACKBLAZE.Backblaze Hard Drive Data and Stats[EB/OL].https://www.backblaze.com/b2/hard-drive-test-data.html.
[7]ZHAO R,GUAN D,JIN Y,et al.Hard Disk Failure Prediction via Transfer Learning[C]//Big Data and Security.Singapore:Springer,2021:522-536.
[8]WANG J,ZHANG R,QI G,et al.A Heuristic-IRM Method on Hard Disk Failure Prediction in Out-of-distribution Environments[C]//2021 IEEE International Conference on Industrial Engineering and Engineering Management.Singapore:IEEE,2021:1661-1664.
[9]ZHANG J,HUANG P,ZHOU K,et al.Hddse:Enabling high-dimensional disk state embedding for generic failure detection system of heterogeneous disks in large data centers[C]//Proceedings of the 2020 USENIX Conference on Usenix Annual Technical Conference.USA:USENIX Association,2020:111-126.
[10]ZHAO N,ZHANG X F,ZHANG L J.Overview of Imbalanced Data Classification[J].Computer Science,2018,45(6A):22-27.
[11]SMITH R W,DIETRICH D L.The bathtub curve:an alternative explanation[C]//Proceedings of Annual Reliability and Maintainability Symposium.USA:IEEE,1994:241-247.
[12]SCHROEDER B,GIBSON G A.Understanding disk failure rates[J].ACM Transactions on Storage,2007,3(3):8.
[13]ZHOU Y,WANG F,FENG D.ASLDP:An Active Semi-supervised Learning method for Disk Failure Prediction[C]//50th International Conference on Parallel Processing.New York:Association for Computing Machinery,2021:1-11.
[14]ZHOU H,NIU Z,WANG G,et al.A Proactive Failure Tolerant Mechanism for SSDs Storage Systems based on Unsupervised Learning[C]//2021 IEEE/ACM 29th International Symposium on Quality of Service.Tokyo:IEEE,2021:1-10.
[15]ZHU B,WANG G,LIU X,et al.Proactive drive failure prediction for large scale storage systems[C]//2013 IEEE 29th Symposium on Mass Storage Systems and Technologies.Long Beach:IEEE,2013:1-5.
[16]SUN X,CHAKRABARTY K,HUANG R,et al.System-level hardware failure prediction using deep learning[C]//2019 56th ACM/IEEE Design Automation Conference.Las Vegas:IEEE,2019:1-6.
[17]BURRELLO A,PAGLIARI D J,BARTOLINI A,et al.Predicting Hard Disk Failures in Data Centers Using Temporal Convolutional Neural Networks[C]//Euro-Par 2020:Parallel Processing Workshops.Cham:Springer,2021:277-289.
[18]ZÜFLE M,KRUPITZER C,ERHARD F,et al.To fail or not to fail:Predicting hard disk drive failure time windows[C]//Measurement,Modelling and Evaluation of Computing Systems.Cham:Springer,2020:19-36.
[19]JIA J,WU P,ZHANG K,et al.Imbalanced Disk Failure Data Processing Method Based on CTGAN[C]//Intelligent Computing Theories and Application.Cham:Springer,2022:638-649.
[20]SHEN J,WAN J,LIM S J,et al.Random-forest-based failure prediction for hard disk drives[J].International Journal of Distributed Sensor Networks,2018,14(11):1-15.
[21]BOTEZATU M,GIURGIU I,BOGOJESKA J,et al.Predicting disk replacement towards reliable data centers[C]//Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.New York:Association for Computing Machinery,2016:39-48.
[22]RINCÓN C A C,PARIS J F,VILALTA R,et al.Disk failure prediction in heterogeneous environments[C]//2017 International Symposium on Performance Evaluation of Computer and Telecommunication Systems.Seattle:IEEE,2017:113-119.
[23]XIAO J,YI Y,XIONG Z,et al.Disk failure prediction in data centers via online learning[C]//Proceedings of the 47th International Conference on Parallel Processing.New York:Association for Computing Machinery,2018:1-10.
[24]XU Y,SUI K,YAO R,et al.Improving service availability of cloud systems by predicting disk error[C]//Proceedings of the 2018 USENIX Conference on Usenix Annual Technical Conference.USA:USENIX Association,2018:481-494.
[25]LI J,JI X,JIA Y,et al.Hard drive failure prediction using classification and regression trees[C]//2014 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks.Atlanta:IEEE,2014:383-394.
[26]PEREIRA F L F,DOS SANTOS LIMA F D,DE MOURA LEITE L G,et al.Transfer learning for Bayesian networks with application on hard disk drives failure prediction[C]//2017 Brazilian Conference on Intelligent Systems.Uberlandia:IEEE,2017:228-233.
[27]XIE Y,FENG D,WANG F,et al.OME:An Optimized Modeling Engine for Disk Failure Prediction in Heterogeneous Data Center[C]//2018 IEEE 36th International Conference on Computer Design.Orlando:IEEE,2019:561-564.
[28]WANG J,LAN C,LIU C,et al.Generalizing to Unseen Domains:A Survey on Domain Generalization[J].IEEE Transactions on Knowledge and Data Engineering,2023,35(8):8052-8072.
[29]FREY B J,DUECK D.Clustering by Passing Messages Between Data Points[J].Science,2007,315(5814):972-976.
[30]CHEN C L P,LIU Z.Broad Learning System:An Effective and Efficient Incremental Learning System Without the Need for Deep Architecture[J].IEEE Transactions on Neural Networks and Learning Systems,2018,29(1):10-24.
[31]PAO Y H,PARK G H,SOBAJIC D J.Learning and generalization characteristics of the random vector functional-link net[J].Neurocomputing,1994,6(2):163-180.
[32]CAI X Y,FENG X,YU H Q.Adaptive Weight Based Broad Learning Algorithm for Cascaded Enhanced Nodes[J].Computer Science,2022,49(6):134-141.
[33]PENG C,CHUNHAO D.Monitoring multi-domain batch process state based on fuzzy broad learning system[J].Expert Systems with Applications,2022,187:115851.
[34]LIU B,ZENG X,TIAN F,et al.Domain Transfer Broad Learning System for Long-Term Drift Compensation in Electronic Nose Systems[J].IEEE Access,2019,7:143947-143959.
[35]BELKIN M,NIYOGI P,SINDHWANI V.Manifold regularization:A geometric framework for learning from labeled and unlabeled examples[J].The Journal of Machine Learning Research,2006,7:2399-2434.
[36]NG N,HULKUND N,CHO K,et al.Predicting Out-of-Domain Generalization with Local Manifold Smoothness[J].arXiv:2207.02093,2022.
[37]LU W,WANG J,SUN X,et al.Out-of-distribution Representation Learning for Time Series Classification[C]//The Eleventh International Conference on Learning Representations.Kigali:OpenReview.net,2023:1-21.
[38]PENG Y,XU J,ZHAO N.Noise Feature Selection Method in PAKDD 2020 Alibaba AI Ops Competition:Large-Scale Disk Failure Prediction[C]//Large-Scale Disk Failure Prediction.Singapore:Springer,2020:109-118.
[39]CAHYADI,FORSHAW M.Hard Disk Failure Prediction on Highly Imbalanced Data using LSTM Network[C]//2021 IEEE International Conference on Big Data.Orlando:IEEE,2021:3985-3991.
[40]PITAKRAT T,VAN HOORN A,GRUNSKE L.A comparison of machine learning algorithms for proactive hard disk drive failure detection[C]//Proceedings of the 4th International ACM Sigsoft Symposium on Architecting Critical Systems.New York:Association for Computing Machinery,2013:1-10.
[41]FRIEDMAN J H.Greedy function approximation:a gradient boosting machine[J].Annals of Statistics,2001,29(5):1189-1232.
[42]CHEN T,GUESTRIN C.XGBoost:A scalable tree boosting system[C]//Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.New York:Association for Computing Machinery,2016:785-794.
[43]KE G,MENG Q,FINLEY T,et al.LightGBM:A Highly Efficient Gradient Boosting Decision Tree[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems.Long Beach:Curran Associates Inc,2017:30.