Computer Science, 2022, Vol. 49, Issue (8): 267-272. doi: 10.11896/jsjkx.210700175
李其烨, 邢红杰
LI Qi-ye, XING Hong-jie
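Abstract: Anomaly detection is an important research topic in machine learning, and a large number of anomaly detection methods already exist. As a commonly used kernel method, kernel principal component analysis (KPCA) has been successfully applied to anomaly detection. However, the conventional KPCA anomaly detection method is very sensitive to noise: noise in the training samples degrades its detection performance. To improve the noise robustness of KPCA anomaly detection, a KPCA anomaly detection method based on the maximum correntropy criterion (MCC) is proposed. Correntropy, taken from information-theoretic learning, replaces the 2-norm-based measure in the KPCA anomaly detection method, and adjusting the width parameter of the correntropy function effectively suppresses the adverse effect of noise. The optimization problem of the proposed method is solved with the half-quadratic optimization technique, which reaches a local optimum within a small number of iterations. In addition, an algorithmic description of the proposed method is given and its computational complexity is analyzed. Experimental results on 16 UCI benchmark datasets show that, compared with four other related methods, the proposed method achieves better noise robustness and generalization performance.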
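The abstract does not spell out the algorithm, so the following is only a toy sketch of the two ideas it names: a correntropy (Gaussian) measure in place of a squared 2-norm, and half-quadratic alternation between sample weights and the model. It is not the paper's KPCA method. To stay short it estimates a robust location (mean) instead of kernel principal components; the function names `correntropy_weights` and `mcc_location`, the width `sigma=2.0`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def correntropy_weights(residual_sq, sigma):
    # Gaussian weight per sample: close to 1 for small residuals,
    # close to 0 for large ones. sigma is the width parameter.
    return np.exp(-residual_sq / (2.0 * sigma ** 2))

def mcc_location(X, sigma, n_iter=50, tol=1e-8):
    # Toy half-quadratic scheme for maximizing
    #   sum_i exp(-||x_i - m||^2 / (2 sigma^2))
    # over m: fix m and update the weights, then fix the weights
    # and solve the resulting weighted least-squares problem for m.
    X = np.asarray(X, dtype=float)
    m = np.median(X, axis=0)  # starting point (an assumption of this sketch)
    for _ in range(n_iter):
        r2 = np.sum((X - m) ** 2, axis=1)
        w = correntropy_weights(r2, sigma)
        m_new = (w[:, None] * X).sum(axis=0) / w.sum()
        converged = np.linalg.norm(m_new - m) < tol
        m = m_new
        if converged:
            break
    return m

# Synthetic data: 200 clean samples around the origin plus 20 noisy samples.
rng = np.random.default_rng(0)
clean = rng.normal(0.0, 1.0, size=(200, 2))
noise = rng.normal(20.0, 1.0, size=(20, 2))
X = np.vstack([clean, noise])

plain_mean = X.mean(axis=0)             # squared-error estimate, pulled toward the noise
robust_mean = mcc_location(X, sigma=2.0)  # correntropy estimate, stays near the clean data
```

A smaller `sigma` discounts distant samples more aggressively, while a very large `sigma` makes every weight approach 1 and the estimate approach the ordinary mean; this is the role the abstract attributes to the width parameter.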