Computer Science, 2020, Vol. 47, Issue (9): 99-104. doi: 10.11896/jsjkx.200600170

• Database & Big Data & Data Science •

High-order Multi-view Outlier Detection

ZHONG Ying-yu, CHEN Song-can

  1. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
  • Received: 2020-06-28 Published: 2020-09-10
  • Corresponding author: CHEN Song-can (s.chen@nuaa.edu.cn)
  • About author: ZHONG Ying-yu (zhongyingyu@nuaa.edu.cn), born in 1995, postgraduate. His main research interests include multi-view learning and anomaly detection.
    CHEN Song-can, born in 1962, Ph.D., professor, Ph.D. supervisor, is a member of China Computer Federation. His main research interests include pattern recognition, machine learning and neural computing.
  • Supported by:
    Key Program of National Natural Science Foundation of China (61732006).

Abstract: Because data are distributed in complex ways across views, traditional single-view outlier detection methods do not carry over to multi-view outliers, which makes multi-view outlier detection a challenging research topic. Multi-view outliers fall into three types: attribute outliers, class outliers, and class-attribute outliers. Existing methods use pairwise cross-view constraints to learn new feature representations and define outlier scoring metrics on those features. These methods do not make full use of the interaction information among views, and their computational complexity grows when three or more views are involved. To address this, this paper reshapes multi-view data into a tensor set, defines high-order multi-view outliers, and proves that all three existing types of multi-view outliers satisfy this definition. On this basis it proposes a new multi-view outlier detection algorithm, High-Order Multi-View Outlier Detection (HOMVOD). The algorithm first reshapes the multi-view data into a tensor set, then learns a low-rank representation of it, and finally applies an outlier function designed for the tensor representation to perform detection. Experiments on UCI datasets show that HOMVOD outperforms existing methods in detecting multi-view outliers.

Key words: Anomaly detection, Low-rank representation, Multi-view learning, Multi-view outlier detection, Tensor representation
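
The three steps named in the abstract (reshape to a tensor set, learn a low-rank representation, score outliers) can be illustrated with a minimal sketch. This is not the HOMVOD algorithm: the abstract does not specify the tensor construction or the outlier function, so the sketch assumes each sample's tensor is the outer product of its per-view feature vectors, and it swaps in a truncated-SVD reconstruction residual as a stand-in for the low-rank representation and scoring. The function names `to_tensor_set` and `low_rank_residual_scores` are hypothetical.

```python
import numpy as np


def to_tensor_set(views):
    """Build one tensor per sample from its per-view feature vectors.

    ASSUMPTION: the tensor is the outer product of the view vectors;
    the abstract does not state the construction actually used.
    views: list of arrays, each of shape (n_samples, d_v).
    Returns an array of shape (n_samples, d_1, ..., d_V).
    """
    n_samples = views[0].shape[0]
    tensors = []
    for i in range(n_samples):
        t = views[0][i]
        for v in views[1:]:
            # Appends one new tensor mode per additional view.
            t = np.multiply.outer(t, v[i])
        tensors.append(t)
    return np.stack(tensors)


def low_rank_residual_scores(tensors, rank):
    """Score each sample by how poorly a low-rank model reconstructs it.

    STAND-IN: truncated SVD on the flattened, centered tensors, not the
    low-rank representation or outlier function of the paper.
    """
    n_samples = tensors.shape[0]
    X = tensors.reshape(n_samples, -1)
    X = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    X_low = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    # A larger residual norm means the sample fits the low-rank model worse.
    return np.linalg.norm(X - X_low, axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Three synthetic views of the same 20 samples (illustrative data only).
    views = [rng.normal(size=(20, d)) for d in (3, 4, 2)]
    tensors = to_tensor_set(views)
    scores = low_rank_residual_scores(tensors, rank=2)
    print(tensors.shape, scores.shape)
```

Building one tensor across all views at once, rather than comparing views pair by pair, is what lets a single model cover three or more views; the cost is that the flattened dimension grows as the product of the view dimensions.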

CLC Number:

  • TP181