Computer Science, 2020, Vol. 47, Issue (6A): 440-443. doi: 10.11896/jsjkx.190600173

• Database & Big Data & Data Science •

Novel Clustering Algorithm Based on Timing-featured Alarms

DENG Tian-tian1, 2, XIONG Yin-qiao1, 2 and HE Xian-hao2

  1. College of Electronic and Communication Engineering, Changsha University, Changsha 410022, China
  2. College of Computer, National University of Defense Technology, Changsha 410073, China
  • Published: 2020-07-07
  • Corresponding author: XIONG Yin-qiao (yq.xiong@ccsu.edu.cn)
  • Contact: dtt@ccsu.edu.cn
  • About author: DENG Tian-tian, Ph.D, senior engineer. Her research interests include big data analysis and open source ecology.
    XIONG Yin-qiao, Ph.D. His research interests include privacy preserving, information security, and the Internet of Things.
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61972058), the Natural Science Foundation of Hunan Province (2020JJ5621), and the Science and Technology Planning Project of Changsha (ZD1601042, K1705031).

Abstract: In cloud environments, large-scale clusters generate massive volumes of time-series alarm data. In practice, operations staff rely on these alarms to locate, diagnose, and repair faults and errors, and so keep systems running normally. Clustering this alarm data effectively and mining the key information it contains is therefore a core issue for the continuous, stable operation of the cloud. To this end, this paper proposes a novel clustering algorithm based on timing-featured alarms. The algorithm uses the time differences between every pair of alarms within a given time window to construct a new relation matrix over the alarms, then applies the K-means algorithm to the column vectors of that matrix to obtain the alarm clusters. Experimental results show that the algorithm can effectively cluster massive alarm data.

Key words: Alarms, Clustering, Timing feature, Data mining
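
The two steps named in the abstract (a pairwise relation matrix built from time differences inside a window, then K-means over its column vectors) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation. The abstract does not say how a time difference becomes a matrix entry, so the linear weight `1 - gap / window` is an assumption. The function names, the skipping of same-type pairs, the farthest-point seeding of K-means, and the sample data are likewise hypothetical.

```python
def build_relation_matrix(events, window):
    """events: iterable of (timestamp, alarm_type). Returns (types, matrix)."""
    types = sorted({alarm for _, alarm in events})
    index = {alarm: i for i, alarm in enumerate(types)}
    matrix = [[0.0] * len(types) for _ in types]
    ordered = sorted(events)
    for i, (t_i, a_i) in enumerate(ordered):
        for t_j, a_j in ordered[i + 1:]:
            gap = t_j - t_i
            if gap > window:
                break  # events are time-sorted, so later ones are even farther away
            if a_i == a_j:
                continue  # assumption: only relations between different alarm types count
            # Assumed weighting: 1 at zero gap, falling linearly to 0 at gap == window.
            weight = 1.0 - gap / window
            matrix[index[a_i]][index[a_j]] += weight
            matrix[index[a_j]][index[a_i]] += weight
    return types, matrix


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(vectors, k, max_iter=100):
    """Plain Lloyd iteration; returns one cluster label per vector."""
    # Assumption: deterministic farthest-point seeding instead of random seeding.
    centers = [list(vectors[0])]
    while len(centers) < k:
        farthest = max(
            vectors, key=lambda v: min(squared_distance(v, c) for c in centers)
        )
        centers.append(list(farthest))
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centers[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_alarms(events, window, k):
    types, matrix = build_relation_matrix(events, window)
    columns = [list(col) for col in zip(*matrix)]  # one column vector per alarm type
    labels = kmeans(columns, k)
    clusters = {}
    for alarm, label in zip(types, labels):
        clusters.setdefault(label, []).append(alarm)
    return sorted(clusters.values())


# Hypothetical data: A/B/C fire together, D/E/F fire together, bursts far apart.
events = [
    (0, "A"), (1, "B"), (2, "C"), (20, "D"), (21, "E"), (22, "F"),
    (50, "A"), (51, "B"), (52, "C"), (70, "D"), (71, "E"), (72, "F"),
]
print(cluster_alarms(events, window=5, k=2))
# [['A', 'B', 'C'], ['D', 'E', 'F']]
```

In this sketch, alarm types that repeatedly fire close together end up with similar columns, because each shares strong entries with the same third alarms, which is what lets K-means group them. The window length and the number of clusters `k` are inputs that would need tuning for real alarm data.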

CLC number: TP274