Computer Science ›› 2013, Vol. 40 ›› Issue (8): 181-185.

Previous Articles     Next Articles

NLOF:A New Density-based Local Outlier Detecting Algorithm

WANG Jing-hua,ZHAO Xin-xiang,ZHANG Guo-yan and LIU Jian-yin   

  • Online:2018-11-16 Published:2018-11-16

Abstract: The time complexity of the density-based outlier detecting algorithm (LOF algorithm) is not ideal,which effects its applications in large scale datasets and high dimensional datasets.Under such circumstances,a new density-based outlier detecting algorithm (NLOF algorithm) was introduced.The main idea of the NLOF algorithm is as follows:the known information is used as much as possible to optimize the neighborhood query operation of adjacent objects in the process of neighborhood searching of a data object.This method is adopted in neighborhood computing and searching in this paper.Firstly,clustering algorithm is taken as a preprocessing,and local outlier factors are calculated only for the data objects out of clusters.Secondly,the local outlier factors for the data objects out of clusters are calculated in the same way as calculating the local outlier factors in LOF algorithm.Leave-one partition information gain is introduced in the process of calculating the local outlier factor.The weight of attribute is determined by leave-one partition information gain.The weighted distance is used in calculating distances between objects.Extensive experimental results show the advantages of the proposed method.NLOF algorithm can improve the outlier detection accuracy,reduce the time complexity and realize the effective local outlier detection.

Key words: Data mining,Outlier detection,Information entropy,Clustering

[1] Hawkins D.Identification of Outliers [M].Londen:Chapman and Hall,1980:188
[2] Han Sang-jun,Cho S-B,et al.Evolutionary Neural Networks for Anomaly Detection Based on the Behavior of a Program [J].IEEE Transactions on Systems,Man,and Cybernetics-Part B:Cybernetics,2006,36(3):559-570
[3] Ramaswamy S,Rastogi R,Shim K.Efficient algorithms for mi-ning outliers from large data sets[J].ACM Sigmoid Record,2000,9(2):427-438
[4] Hung Wen-liang,Yang Min-shen.An Omission Approach forDetecting Outliers in Fuzzy Regression Models [J].Fuzzy Sets and Systems,2006,157(23):3109-3122
[5] Liu Xiao-hui,Cheng Gong-xian,Wu J X.Analyzing OutliersCautiously[J].IEEE Transactions on Knowledge and Data Engineering,2002,14(2):432-437
[6] Breunig M,Kriegel H P,Ng R,et al.LOF:Identifying Density-based Local Outliers,2000[C]∥Proc.of the ACM SIGMOD International Conference on Management of Data.[s.1.]:ACM press,2000:93-104
[7] Tang J,Chen Z,Fu A,et al.Enhancing effectiveness of outlier detections for low-density patterns,2002[C]∥Proceeding of Advances in Knowledge Discovery and Data Mining 6th Pacific Asia Conference,Lecture Notes in Computer Science.Taipei,China,2002:535-548
[8] Ni Wei-wei,Chen Geng,Lu Jie-ping,et al.Local Entropy BasedWeighted Subspace Outlier Mining Algorithm[J].Journal of Computer Research Development,2008,45(7):1189-1194
[9] Papadimitirou S,Kitagawa H,Gibbons P B,et al.LOCI:Fastoutlier detection using the local correlation integral [C]∥Proc of the 19th Int Conf on Data Engineering.Los Alamitos:IEEE Computer Society,2003:315-326
[10] 薛安荣,鞠时光,何伟华,等.局部离群点挖掘算法研究[J].计算机学报,2007,30(8):1455-1463
[11] 胡彩平,秦小麟.一种基于密度的局部离群点检测算法DLOF[J].计算机研究与发展,2010,7(12):2110-2116
[12] 张净,孙志挥,等.基于信息论的高维海量数据离群点挖掘[J].计算机科学,2011,8(7):148-161

No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!