Computer Science ›› 2020, Vol. 47 ›› Issue (9): 99-104. doi: 10.11896/jsjkx.200600170

• Database & Big Data & Data Science •

High-order Multi-view Outlier Detection

ZHONG Ying-yu, CHEN Song-can   

  1. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
  • Received: 2020-06-28 Published: 2020-09-10
  • About author: ZHONG Ying-yu, born in 1995, postgraduate. His main research interests include multi-view learning and anomaly detection.
    CHEN Song-can, born in 1962, Ph.D, professor, Ph.D supervisor, is a member of China Computer Federation. His main research interests include pattern recognition, machine learning and neural computing.
  • Supported by:
    Key Program of National Natural Science Foundation of China (61732006).

Abstract: Because data are distributed in complex ways across different views, traditional single-view outlier detection methods are not applicable to multi-view outliers, which makes multi-view outlier detection a challenging research topic. Multi-view outliers can be divided into three types: attribute outliers, class outliers, and class-attribute outliers. Existing methods use pairwise constraints across views to learn new feature representations and define outlier scoring metrics based on these features. Such methods do not take full advantage of the interaction information between views, and they incur higher computational complexity when three or more views are involved. This paper therefore reshapes multi-view data into tensor set form, defines high-order multi-view outliers, and proves that all three existing types of multi-view outliers satisfy this definition. On this basis, it proposes a new multi-view outlier detection algorithm called high-order multi-view outlier detection (HOMVOD). Specifically, the algorithm first reshapes the multi-view data into tensor set form, then learns a low-rank representation of it, and finally designs an outlier function under the tensor representation to perform detection. Experiments on UCI datasets show that this method is superior to existing methods in detecting multi-view outliers.
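
The three steps named in the abstract (tensor set construction, low-rank representation, outlier function) can be illustrated with a minimal sketch. It is not the HOMVOD formulation, which the abstract does not spell out. It assumes that each sample's tensor is the outer product of its view vectors, it uses a truncated SVD of the flattened tensors as a stand-in for the low-rank representation step, and it uses the reconstruction residual as a stand-in outlier function. The function names `to_tensor_set` and `low_rank_residual_scores` and the toy data are hypothetical.

```python
import numpy as np


def to_tensor_set(views):
    """Turn V aligned views (each of shape n x d_v) into n order-V tensors.

    Assumption: sample i is represented by the outer product of its V view
    vectors, so the result has shape (n, d_1, ..., d_V).
    """
    n = views[0].shape[0]
    tensors = []
    for i in range(n):
        t = views[0][i]
        for view in views[1:]:
            t = np.multiply.outer(t, view[i])
        tensors.append(t)
    return np.stack(tensors)


def low_rank_residual_scores(tensors, rank):
    """Score each sample by its distance to a rank-`rank` approximation.

    Stand-in for the paper's low-rank representation and outlier function:
    flatten every tensor to a row, keep the top `rank` singular components,
    and return the per-sample reconstruction error.
    """
    n = tensors.shape[0]
    X = tensors.reshape(n, -1)
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    X_low = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    return np.linalg.norm(X - X_low, axis=1)


# Toy data (hypothetical): three views driven by one latent factor z.
rng = np.random.default_rng(0)
n = 100
z = rng.uniform(1.0, 1.5, size=n)
a = np.array([1.0, 2.0, 3.0])
b = np.array([1.0, 0.0, 1.0, 0.0])
c = np.array([2.0, 1.0])
views = [np.outer(z, a), np.outer(z, b), np.outer(z, c)]

# Make sample 7 inconsistent across views: its second view points in a
# direction orthogonal to b while its other views stay unchanged.
views[1][7] = z[7] * np.array([0.0, 1.0, 0.0, 1.0])

tensors = to_tensor_set(views)
scores = low_rank_residual_scores(tensors, rank=1)
print("most suspicious sample:", int(np.argmax(scores)))
```

In this toy setting every consistent sample lies along a single direction in the flattened tensor space, so a rank-1 approximation reconstructs it almost exactly, while the sample whose views disagree keeps a large residual. Real data would need a noise-tolerant low-rank model and a tuned rank, neither of which this sketch attempts.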

Key words: Multi-view outlier detection, Multi-view learning, Anomaly detection, Tensor representation, Low-rank representation

CLC Number: 

  • TP181