Computer Science ›› 2018, Vol. 45 ›› Issue (7): 207-213.doi: 10.11896/j.issn.1002-137X.2018.07.036

• Artificial Intelligence • Previous Articles     Next Articles

Density Scaling Factor Based ISOMAP Algorithm

LI Xiang-yuan, CAI Cheng, HE Jin-rong   

  1. College of Information Engineering,Northwest A & F University,Yangling,Shaanxi 712100,China
  • Received:2017-04-19 Online:2018-07-30 Published:2018-07-30

Abstract: ISOMAP is a widely used nonlinear unsupervised dimensionality reduction algorithm,and preserves the coor-dinate transformation from high-dimensional space to low-dimensional space by keeping the geodesic distance between the observation samples.However,practical data are inevitably corrupted by noise.Since the calculation of geodesic distance is sensitive to noise and does not consider the density distribution of dataset,the geometric deformation of the ISOMAP algorithm is happening after dimensionality reduction.For this shortcoming,according to the idea of local density,a density scaling factor based ISOMAP algorithm called D-ISOMAP was proposed.In the framework of traditional ISOMAP algorithm,local density scaling factor is first calculated for each observation sample.And then the original geo-desic distance of two samples is divided by the product of their density scaling factors.Finally,the improved distance matrix is obtained by the shortest path algorithm.Since the improved geodesic distance is reduced in the larger density region and enlarged in the smaller density region,it can achieve a good effect in noisy situations.The experimental results on artificial dataset and UCI dataset show that the D-ISOMAP algorithm is better than the classical unsupervised dimensionality reduction algorithms in terms of visualization and clustering of data sets.

Key words: Density scaling factor, Dimensionality reduction, ISOMAP, Manifold learning, Noise data

CLC Number: 

  • TP181
[1]LEW M S,SEBE N,DJERABA C,et al.Content-based multimedia information retrieval:State of the art and challenges[J].ACM Transactions on Multimedia Computing,Communications,and Applications (TOMM),2006,2(1):1-19.
[2]ZHANG T,TANG Y Y,FANG B,et al.Document Clustering in Correlation Similarity Measure Space[J].IEEE Transactions on Knowledge & Data Engineering,2012,24(6):1002-1013.
[3]BAR-JOSEPH Z,GITTER A,SIMON I.Studying and modelling dynamic biological processes using time-series gene expression data[J].Nature Reviews Genetics,2012,13(8):552.
[4]DONOHO D L.High-Dimensional Data Analysis:The Cursesand Blessings of Dimensionality[C]∥Lecture — Math Challenges of Century.2000:178-183.
[5]ROWEIS S T,SAUL L K.Nonlinear dimensionality reduction by locally linear embedding[J].Science,2000,290(5500):2323-2326.
[6]H RDLE W K,HL VKA Z.Principal Component Analysis[J].IEEE Trans on Automatic Control,2015,29(1):163-183.
[7]AMARI S I,CICHOCKI A.A new learning algorithm for blind source separation[C]∥Advances in Neural Information Processing Systems.1996:757-763.
[8]MARTINEZ A M,KAK A C.PCA versus LDA[J].IEEETransactions on Pattern Analysis & Machine Intelligence,2001,23(2):228-233.
[9]COX M AA,COX T F.Multidimensional Scaling[J].Journal of the Royal Statistical Society,2008,46(2):1050-1057.
[10]TENENBAUM J B,SILVA V D,LANGFORD J C.A Global Geometric Framework for Nonlinear Dimensionality Reduction[J].Science,2000,290(5500):2319-2323.
[11]YANG J,FRANGI A F,YANG J,et al.KPCA plus LDA:acomplete kernel Fisher discriminant framework for feature extraction and recognition[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2005,27(2):230-244.
[12]YU J,KIM S B.Density-based geodesic distance for identifying the noisy and nonlinear clusters[J].Information Sciences,2016,360:231-243.
[13]RODRIGUEZ A,LAIO A.Clustering by fast search and find of density peaks[J].Science,2014,344(6191):1492-1496.
[14]GENG X,ZHAN D C,ZHOU Z H.Supervised nonlinear dimensionality reduction for visualization and classification[J].IEEE Transactions on Systems Man & Cybernetics Part B Cyberne-tics,2005,35(6):1098-1107.
[15]ZHANG Z,CHOW T W S,ZHAO M.M-Isomap:Orthogonalconstrained marginal isomap for nonlinear dimensionality reduction[J].IEEE Transactions on Cybernetics,2013,43(1):180-191.
[16]YANG B,XIANG M,ZHANG Y.Multi-manifold Discriminant Isomap for visualization and classification[J].Pattern Recognition,2016,55:215-230.
[17]HE J R,DING L X,LI Z K,et al.Margin Discriminant Projection for Dimensionality Reduction[J].Journal of Software,2014,25(4):826-838.(in Chinese)
何进荣,丁立新,李照奎,等.基于边界判别投影的数据降维[J].软件学报,2014,25(4):826-838.
[18]BERNSTEIN M,SILVA V D,LANGFORD J C,et al.Graph Approximations to Geodesics on Embedded Manifolds[J].Stanford University,2001,24(9):153-158.
[19]MAIER M,LUXBURG U V,HEIN M.Influence of graph construction on graph-based clustering[C]∥Conference on Neural Information Processing Systems.Vancouver,British Columbia,Canada,DBLP,2008:1025-1032.
[20]FU Y,MA Y.Graph Embedding for Pattern Analysis[M].New York:Springer,2013.
[21]LEE J S,OLAFSSON S.Data clustering by minimizing disconnectivity[J].Information Sciences,2011,181(4):732-746.
[1] YANG Hui, TAO Li-hong, ZHU Jian-yong, NIE Fei-ping. Fast Unsupervised Graph Embedding Based on Anchors [J]. Computer Science, 2022, 49(4): 116-123.
[2] CHEN Chang-wei, ZHOU Xiao-feng. Fast Local Collaborative Representation Based Classifier and Its Applications in Face Recognition [J]. Computer Science, 2021, 48(9): 208-215.
[3] BAI Zi-yi, MAO Yi-rong , WANG Rui-ping. Survey on Video-based Face Recognition [J]. Computer Science, 2021, 48(3): 50-59.
[4] ZOU Cheng-ming, CHEN De. Unsupervised Anomaly Detection Method for High-dimensional Big Data Analysis [J]. Computer Science, 2021, 48(2): 121-127.
[5] YANG Zhang-jing, WANG Wen-bo, HUANG Pu, ZHANG Fan-long, WANG Xin. Local Weighted Representation Based Linear Regression Classifier and Face Recognition [J]. Computer Science, 2021, 48(11A): 351-359.
[6] ZHANG Jun, WANG Yang, LI Kun-hao, LI Chang, ZHAO Chuan-xin. Multi-source Sensor Body Area Network Data Fusion Model Based on Manifold Learning [J]. Computer Science, 2020, 47(8): 323-328.
[7] HUANG Xuan. Research and Development of Feature Dimensionality Reduction [J]. Computer Science, 2018, 45(6A): 16-21.
[8] MA Chao-hong and WENG Xiao-qing. Early Classification of Time Series Based on Piecewise Aggregate Approximation [J]. Computer Science, 2018, 45(2): 291-296.
[9] ZHANG Meng-lin and LI Zhan-shan. Feature Selection Algorithm Using SAC Algorithm [J]. Computer Science, 2018, 45(2): 63-68.
[10] ZHANG Shao-qun. Manifold Learning Algorithm Based on Compact Setsub-coverage [J]. Computer Science, 2017, 44(Z6): 88-91.
[11] XUE Xiao-yu and MA Xiao-hu. Double Adjacency Graphs Based Orthogonal Neighborhood Preserving Projections for Face Recognition [J]. Computer Science, 2017, 44(8): 31-35.
[12] YE Jun and JIN Zhong. Hypergraph Dual Regularization Concept Factorization Algorithm and Its Application in Data Representation [J]. Computer Science, 2017, 44(7): 309-314.
[13] ZHANG Jian-hui, WANG Hui-qing, SUN Hong-wei, GUO Zhi-rong and BAI Ying-ying. Similarity Measure Algorithm of Time Series Based on Binary-dividing SAX [J]. Computer Science, 2017, 44(1): 247-252.
[14] REN Ying-chun, WANG Zhi-cheng, CHEN Yu-fei, ZHAO Wei-dong and PENG Lei. Fast Feature Extraction Algorithm Based on Manifold Learning and Sparsity Constraints [J]. Computer Science, 2016, 43(8): 277-281.
[15] ZENG An and ZHENG Qi-mi. Deep Belief Networks Research Based on Maximum Information Coefficient [J]. Computer Science, 2016, 43(8): 249-253.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!