Computer Science ›› 2022, Vol. 49 ›› Issue (4): 116-123.doi: 10.11896/jsjkx.210200098

• Database & Big Data & Data Science • Previous Articles     Next Articles

Fast Unsupervised Graph Embedding Based on Anchors

YANG Hui1,2, TAO Li-hong1,2, ZHU Jian-yong1,2, NIE Fei-ping3   

  1. 1 School of Electrical and Automation Engineering, East China Jiaotong University, Nanchang 330013, China;
    2 Key Laboratory of Advanced Control and Optimization of Jiangxi Province, Nanchang 330013, China;
    3 Center for Optical Image Analysis and Learning (OPTIMAL), Northwestern Polytechnical University, Xi'an 710072, China
  • Received:2021-02-13 Revised:2021-07-09 Published:2022-04-01
  • About author:YANG Hui,born in 1965,Ph.D,professor,Ph.D supervisor.His main research interests include intelligent transportation system,complex system modeling,control and optimization,process industry integrated automation technology and applications.NIE Fei-ping,born in 1977,Ph.D,professor,Ph.D supervisor.His main research interests include machine lear-ning and its applications,such as pattern recognition,data mining,computer vision,image processing and information retrieval.
  • Supported by:
    This work was supported by the National Natural Science Foundation of China(61963015,61733005,61863014),Natural Science Foundation of Jiangxi Province(20171ACB21039,20192BAB207024) and Science and Technology Project of Jiangxi Provincial Department of Education(GJJ150552).

Abstract: Graph embedding is a widely used method for dimensionality reduction due to its computational effectiveness.The computational complexity of graph embedding method to construct traditional K-Nearest Neighbors (K-NN) graph is at least O(n2d), where n and d represents the sample size and dimensions respectively.The construction of K-NN graphs is very time-consuming since the computational complexity is proportional to the square of the number of samples in the case of a large amount of data, which will limit the application of graph embedding algorithms on large-scale data sets.To address this problem, a fast unsupervised graph embedding based on anchors (FUGE) method is proposed in this paper.FUGE first selects anchors (representative points) from the data, then constructs a similarity graph between the data and anchors, and finally performs graph embedding analysis.Since the number of anchors is much smaller than the number of data, the proposed method can effectively reduce the computational complexity of the process of graph construction.Different from using the kernel function to construct the similarity graph, FUGE directly learns the data point-anchor similarity graph by the neighbor information of the data, which further accelerates the process of graph construction.The overall computational complexity of FUGE is O(nd2+nmd), where m is the number of anchors.Extensive experiments on real-world benchmark data sets show the effectiveness and efficiency of the proposed method.

Key words: K-means++, Anchors, Dimensionality reduction, Graph embedding, Orthogonal constraint

CLC Number: 

  • TP391
[1] HE W Q,LIU B L,SUN Z C,et al.Kernel-preserving Embedding Based Subspace Learning[J].Computer Science,2021,48(6):79-85.
[2] TAO Y,BAO L L,HU H.Dimensionality Reduction MethodCombining Representation Learning and Embedded Subspace Learning[J].Computer Engineering,2021,47(6):83-87,97.
[3] ZOU C M,CHEN D.Unsupervised Anomaly Detection Method for High-dimensional Big Data Analysis[J].Computer Science,2021,48(2):121-127.
[4] CUNNINGHAM J P,YU B M.Dimensionality reduction forlarge-scale neural recordings[J].Nature Neuroscience,2014,17(11):1500-1509.
[5] WONG W K,LAI Z,WEN J,et al.Low-Rank Embedding for Robust Image Feature Extraction[J].IEEE Trans Image Process,2017,26(6):2905-2917.
[6] LIII P K.On lines and planes of closest fit to systems of points in space[J].The London,Edinburgh,and Dublin Philosophical Magazine and Journal of Science,1901,2(11):559-572.
[7] BELHUMEUR P N,HESPANHA J P,KRIEGMAN D J.Eigenfaces vs.fisherfaces:Recognition using class specific linear projection[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,1997,19(7):711-720.
[8] HE X,NIYOGI P.Locality preserving projections[J].Advances in Neural Information Processing Systems,2004,16(16):153-160.
[9] BELKIN M,NIYOGI P.Laplacian Eigenmaps and SpectralTechniques for Embedding and Clustering[C]//Proceedings of the 14th International Conference on Neural Information Processing Systems:Natural and Synthetic.Cambridge:MIT Press,2001:585-591.
[10] TENENBAUM J B,DE SILVA V,LANGFORD J C.A global geometric framework for nonlinear dimensionality reduction[J].Science,2000,290(5500):2319-2323.
[11] ROWEIS S T,SAUL L K.Nonlinear dimensionality reduction by locally linear embedding[J].Science,2000,290(5500):2323-2326.
[12] YAN S,XU D,ZHANG B,et al.Graph embedding and extensions:A general framework for dimensionality reduction[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2006,29(1):40-51.
[13] CAI D,HE X,HAN J.Spectral regression:A unified subspace learning framework for content-based image retrieval[C]//Proceedings of the 15th ACM international Conference on Multi-media.New York:ACM Press,2007:403-412.
[14] LIU W,HE J,CHANG S F.Large graph construction for scalable semi-supervised learning[C]//Proceedings of the 27th International Conference on Machine Learning.New York:ACM Press,2010:679-686.
[15] CAI D.Compressed spectral regression for efficient nonlinear dimensionality reduction[C]//Proceedings of the 24th International Conference on Artificial Intelligence.California:AAAI Press,2015:3359-3365.
[16] NIE F,ZHU W,LI X.Unsupervised large graph embedding[C]//Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence.California:AAAI Press,2017:2422-2428.
[17] JIANG R,FU W,WEN L,et al.Dimensionality reduction on anchorgraph with an efficient locality preserving projection[J].Neurocomputing,2016,187:109-118.
[18] YU W,NIE F,WANG F,et al.Fast and flexible large graph embedding based on anchors[J].IEEE Journal of Selected Topics in Signal Processing,2018,12(6):1465-1475.
[19] GOYAL P,FERRARA E.Graph embedding techniques,applications,and performance:A survey[J].Knowledge-Based Systems,2018,151:78-94.
[20] CAI H,ZHENG V W,CHANG K C C.A comprehensive survey of graph embedding:Problems,techniques,and applications[J].IEEE Transactions on Knowledge and Data Engineering,2018,30(9):1616-1637.
[21] GUO Y F,LI S J,YANG J Y,et al.A generalized Foley-Sammon transform based on generalized fisher discriminant criterion and its application to face recognition[J].Pattern Recognition Letters,2003,24(1/2/3):147-158.
[22] WANG H,YAN S,XU D,et al.Trace ratio vs.ratio trace for dimensionality reduction[C]//IEEE Conference on Computer Vision and Pattern Recognition.New York:IEEE Press,2007:1-8.
[23] JIA Y,NIE F,ZHANG C.Trace ratio problem revisited[J].IEEE Transactions on Neural Networks,2009,20(4):729-735.
[24] FUKUNAGA K.Introduction to statistical pattern recognition[M].Elsevier,2013.
[25] VASSILVITSKII S,ARTHUR D.k-means++:The advantages of careful seeding[C]//Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms.2006:1027-1035.
[26] BACHEM O,LUCIC M,HASSANI S H,et al.Approximate k-means++in sublinear time[C]//Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence.California:AAAI Press,2016:1459-1467.
[27] NIE F,WANG X,HUANG H.Clustering and projected clustering with adaptive neighbors[C]//Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Disco-very and Data Mining.New York:ACM Press,2014:977-986.
[28] LIU Y,NIE F,GAO Q,et al.Flexible unsupervised feature extraction for image classification[J].Neural Networks,2019,115:65-71.
[29] ZELNIK-MANOR L,PERONA P.Self-tuning spectral clustering[C]//Proceedings of the 17th International Conference on Neural Information Processing Systems.Cambridge:MIT Press,2004:1601-1608.
[30] ESTÉVEZ P A,TESMER M,PEREZ C A,et al.Normalized mutual information feature selection[J].IEEE Transactions on Neural Networks,2009,20(2):189-201.
[1] LI Yong, WU Jing-peng, ZHANG Zhong-ying, ZHANG Qiang. Link Prediction for Node Featureless Networks Based on Faster Attention Mechanism [J]. Computer Science, 2022, 49(4): 43-48.
[2] ZOU Cheng-ming, CHEN De. Unsupervised Anomaly Detection Method for High-dimensional Big Data Analysis [J]. Computer Science, 2021, 48(2): 121-127.
[3] XING Chang-zheng, ZHU Jin-xia, MENG Xiang-fu, QI Xue-yue, ZHU Yao, ZHANG Feng, YANG Yi-ming. Point-of-interest Recommendation:A Survey [J]. Computer Science, 2021, 48(11A): 176-183.
[4] LI Xiang-yuan, CAI Cheng, HE Jin-rong. Density Scaling Factor Based ISOMAP Algorithm [J]. Computer Science, 2018, 45(7): 207-213.
[5] HUANG Xuan. Research and Development of Feature Dimensionality Reduction [J]. Computer Science, 2018, 45(6A): 16-21.
[6] ZHANG Meng-lin and LI Zhan-shan. Feature Selection Algorithm Using SAC Algorithm [J]. Computer Science, 2018, 45(2): 63-68.
[7] MA Chao-hong and WENG Xiao-qing. Early Classification of Time Series Based on Piecewise Aggregate Approximation [J]. Computer Science, 2018, 45(2): 291-296.
[8] ZHANG Jian-hui, WANG Hui-qing, SUN Hong-wei, GUO Zhi-rong and BAI Ying-ying. Similarity Measure Algorithm of Time Series Based on Binary-dividing SAX [J]. Computer Science, 2017, 44(1): 247-252.
[9] ZENG An and ZHENG Qi-mi. Deep Belief Networks Research Based on Maximum Information Coefficient [J]. Computer Science, 2016, 43(8): 249-253.
[10] WANG Ya-si, YAO Hong-xun, SUN Xiao-shuai, XU Peng-fei and ZHAO Si-cheng. Representation Ability Research of Auto-encoders in Deep Learning [J]. Computer Science, 2015, 42(9): 56-60.
[11] WEI Jia-yin QIN Yong-bin XU Dao-yun. DRPFSP Algorithm for Solving Permutation Flow Shop Scheduling Problem [J]. Computer Science, 2015, 42(7): 68-73.
[12] LI Yan-yan and YAN De-qin. Dimensionality Reduction Algorithm Based on Neighborhood Rival Linear Embedding [J]. Computer Science, 2015, 42(2): 256-259.
[13] HONG Chao-qun, CHEN Xu-hui, WANG Xiao-dong, LI Shi-jin and WU Ke-shou. Hypergraph Dimensionality Reduction with Multiple Feature Fusion Based on GPU Parallel Acceleration [J]. Computer Science, 2015, 42(11): 90-93.
[14] WANG Wan-liang and CAI Jing. Incremental Learning Algorithm of Non-negative Matrix Factorization with Sparseness Constraints [J]. Computer Science, 2014, 41(8): 241-244.
[15] WANG Juan,DU Hai-shun,HOU Yan-dong and JIN Yong. Graph Embedding Projective Non-negative Matrix Factorization Method for Image Feature Extraction [J]. Computer Science, 2014, 41(8): 311-315.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!