Computer Science, 2022, Vol. 49, Issue (9): 101-110. doi: 10.11896/jsjkx.210600174

• Database & Big Data & Data Science

Time Series Data Anomaly Detection Based on Total Variation Ratio Separation Distance

XU Tian-hui1, GUO Qiang1, ZHANG Cai-ming2

  1. School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan 250014, China
  2. School of Software, Shandong University, Jinan 250014, China
  • Received: 2021-06-22 Revised: 2021-10-15 Online: 2022-09-15 Published: 2022-09-09
  • Corresponding author: GUO Qiang (guoqiang@sdufe.edu.cn)
  • About author: XU Tian-hui (xutianhui06@qq.com), born in 1998, postgraduate. Her main research interests include data analysis and anomaly detection.
    GUO Qiang, born in 1979, Ph.D., professor, is a member of China Computer Federation. His main research interests include computer vision and data mining.
  • Supported by:
    National Natural Science Foundation of China (61873145, 61802229), Natural Science Foundation of Shandong Province for Excellent Young Scholars (ZR2017JL029) and Science and Technology Innovation Program for Distinguished Young Scholars of Shandong Province Higher Education Institutions (2019KJN045).

Abstract: Anomaly detection for time series data is an important research problem in data analysis. Its main challenge is to use the context of each data point to judge accurately whether an anomaly exists and, if so, to locate it with low delay. Most existing detection methods capture anomalies by using the probability density ratio to measure the similarity between sequences, and they rely on cross-validation to estimate the parameters of the density-ratio model. Cross-validation increases the computational complexity, which lowers computational efficiency and leads to a high detection delay. To address these issues, this paper proposes a detection method based on the total variation ratio separation distance. The method uses total variation to extract the fluctuation features of a sequence and, on that basis, computes the total variation ratio separation distance to measure the similarity between sequences, which improves computational efficiency and allows anomalies to be located with low delay. To reduce noise interference, the method is further combined with relative total variation, which makes it more robust and further improves detection accuracy. Experimental results show that the proposed method performs well in terms of detection accuracy, delay and computational efficiency.

Key words: Anomaly detection, Probability density ratio, Time delay, Total variation, Relative total variation
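
The idea of comparing sequences through their total variation can be illustrated with a small sketch. The abstract does not give the exact definition of the total variation ratio separation distance, so the score below (the absolute log-ratio of the total variation of two adjacent windows) is only an assumed stand-in for illustration. The function names `total_variation`, `tv_ratio_score` and `sliding_scores`, the window length, and the `eps` guard are likewise hypothetical.

```python
import numpy as np


def total_variation(x):
    """Discrete total variation of a 1-D sequence: sum of |x[i+1] - x[i]|."""
    x = np.asarray(x, dtype=float)
    return float(np.sum(np.abs(np.diff(x))))


def tv_ratio_score(past, future, eps=1e-12):
    """Assumed dissimilarity between two windows (not the paper's formula).

    Returns |log(TV(future) / TV(past))|, which is 0 when both windows
    fluctuate equally and grows as their fluctuation levels diverge.
    """
    tv_past = total_variation(past) + eps      # eps avoids division by zero
    tv_future = total_variation(future) + eps
    return float(abs(np.log(tv_future / tv_past)))


def sliding_scores(series, window):
    """Score every split point t by comparing the windows before and after it."""
    series = np.asarray(series, dtype=float)
    scores = []
    for t in range(window, len(series) - window + 1):
        past = series[t - window:t]
        future = series[t:t + window]
        scores.append(tv_ratio_score(past, future))
    return np.array(scores)


# Example: a low-amplitude oscillation followed by a high-amplitude one.
series = [0, 1, 0, 1, 0, 1, 0, 1, 0, 10, 0, 10, 0, 10, 0, 10]
scores = sliding_scores(series, window=4)
```

Because the score needs only first differences and a ratio, it has no model parameters to tune by cross-validation, which is the efficiency argument the abstract makes against density-ratio methods. How the paper thresholds the distance and integrates relative total variation for denoising is not stated here and is left out of the sketch.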

CLC Number:

  • TP391