计算机科学 ›› 2022, Vol. 49 ›› Issue (4): 116-123.doi: 10.11896/jsjkx.210200098
杨辉1,2, 陶力宏1,2, 朱建勇1,2, 聂飞平3
YANG Hui1,2, TAO Li-hong1,2, ZHU Jian-yong1,2, NIE Fei-ping3
摘要: 图嵌入降维算法由于其有效性被广泛应用。传统图嵌入算法构造K-Nearest Neighbors(K-NN)图的计算复杂度至少为O(n2d),其中n为样本数,d为样本维度。在数据量大的情况下,构造K-NN图将非常耗时,因为其计算复杂度与样本数的平方成正比,这将限制图嵌入算法在大规模数据集上的应用。为降低构图过程的计算复杂度,提出一种基于锚点的快速无监督图嵌入算法(Fast Unsupervised Graph Embedding Based on Anchors,FUGE)。该算法首先从数据集中选取锚点(代表点),然后构造数据点-锚点相似度图,最后执行图嵌入分析。由于锚点数量远小于数据量,所提方法能有效地降低构图过程的计算复杂度;不同于使用核函数来构造相似度图,该算法直接通过数据点的近邻信息来学习数据点-锚点的相似度图,这进一步加快了构图过程。整个算法的计算复杂度为O(nd2+nmd),其中m为锚点数。在基准数据集上的大量实验证明了所提算法的有效性和高效性。
中图分类号:
[1] HE W Q,LIU B L,SUN Z C,et al.Kernel-preserving Embedding Based Subspace Learning[J].Computer Science,2021,48(6):79-85. [2] TAO Y,BAO L L,HU H.Dimensionality Reduction MethodCombining Representation Learning and Embedded Subspace Learning[J].Computer Engineering,2021,47(6):83-87,97. [3] ZOU C M,CHEN D.Unsupervised Anomaly Detection Method for High-dimensional Big Data Analysis[J].Computer Science,2021,48(2):121-127. [4] CUNNINGHAM J P,YU B M.Dimensionality reduction forlarge-scale neural recordings[J].Nature Neuroscience,2014,17(11):1500-1509. [5] WONG W K,LAI Z,WEN J,et al.Low-Rank Embedding for Robust Image Feature Extraction[J].IEEE Trans Image Process,2017,26(6):2905-2917. [6] LIII P K.On lines and planes of closest fit to systems of points in space[J].The London,Edinburgh,and Dublin Philosophical Magazine and Journal of Science,1901,2(11):559-572. [7] BELHUMEUR P N,HESPANHA J P,KRIEGMAN D J.Eigenfaces vs.fisherfaces:Recognition using class specific linear projection[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,1997,19(7):711-720. [8] HE X,NIYOGI P.Locality preserving projections[J].Advances in Neural Information Processing Systems,2004,16(16):153-160. [9] BELKIN M,NIYOGI P.Laplacian Eigenmaps and SpectralTechniques for Embedding and Clustering[C]//Proceedings of the 14th International Conference on Neural Information Processing Systems:Natural and Synthetic.Cambridge:MIT Press,2001:585-591. [10] TENENBAUM J B,DE SILVA V,LANGFORD J C.A global geometric framework for nonlinear dimensionality reduction[J].Science,2000,290(5500):2319-2323. [11] ROWEIS S T,SAUL L K.Nonlinear dimensionality reduction by locally linear embedding[J].Science,2000,290(5500):2323-2326. [12] YAN S,XU D,ZHANG B,et al.Graph embedding and extensions:A general framework for dimensionality reduction[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2006,29(1):40-51. [13] CAI D,HE X,HAN J.Spectral regression:A unified subspace learning framework for content-based image retrieval[C]//Proceedings of the 15th ACM international Conference on Multi-media.New York:ACM Press,2007:403-412. [14] LIU W,HE J,CHANG S F.Large graph construction for scalable semi-supervised learning[C]//Proceedings of the 27th International Conference on Machine Learning.New York:ACM Press,2010:679-686. [15] CAI D.Compressed spectral regression for efficient nonlinear dimensionality reduction[C]//Proceedings of the 24th International Conference on Artificial Intelligence.California:AAAI Press,2015:3359-3365. [16] NIE F,ZHU W,LI X.Unsupervised large graph embedding[C]//Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence.California:AAAI Press,2017:2422-2428. [17] JIANG R,FU W,WEN L,et al.Dimensionality reduction on anchorgraph with an efficient locality preserving projection[J].Neurocomputing,2016,187:109-118. [18] YU W,NIE F,WANG F,et al.Fast and flexible large graph embedding based on anchors[J].IEEE Journal of Selected Topics in Signal Processing,2018,12(6):1465-1475. [19] GOYAL P,FERRARA E.Graph embedding techniques,applications,and performance:A survey[J].Knowledge-Based Systems,2018,151:78-94. [20] CAI H,ZHENG V W,CHANG K C C.A comprehensive survey of graph embedding:Problems,techniques,and applications[J].IEEE Transactions on Knowledge and Data Engineering,2018,30(9):1616-1637. [21] GUO Y F,LI S J,YANG J Y,et al.A generalized Foley-Sammon transform based on generalized fisher discriminant criterion and its application to face recognition[J].Pattern Recognition Letters,2003,24(1/2/3):147-158. [22] WANG H,YAN S,XU D,et al.Trace ratio vs.ratio trace for dimensionality reduction[C]//IEEE Conference on Computer Vision and Pattern Recognition.New York:IEEE Press,2007:1-8. [23] JIA Y,NIE F,ZHANG C.Trace ratio problem revisited[J].IEEE Transactions on Neural Networks,2009,20(4):729-735. [24] FUKUNAGA K.Introduction to statistical pattern recognition[M].Elsevier,2013. [25] VASSILVITSKII S,ARTHUR D.k-means++:The advantages of careful seeding[C]//Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms.2006:1027-1035. [26] BACHEM O,LUCIC M,HASSANI S H,et al.Approximate k-means++in sublinear time[C]//Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence.California:AAAI Press,2016:1459-1467. [27] NIE F,WANG X,HUANG H.Clustering and projected clustering with adaptive neighbors[C]//Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Disco-very and Data Mining.New York:ACM Press,2014:977-986. [28] LIU Y,NIE F,GAO Q,et al.Flexible unsupervised feature extraction for image classification[J].Neural Networks,2019,115:65-71. [29] ZELNIK-MANOR L,PERONA P.Self-tuning spectral clustering[C]//Proceedings of the 17th International Conference on Neural Information Processing Systems.Cambridge:MIT Press,2004:1601-1608. [30] ESTÉVEZ P A,TESMER M,PEREZ C A,et al.Normalized mutual information feature selection[J].IEEE Transactions on Neural Networks,2009,20(2):189-201. |
[1] | 李勇, 吴京鹏, 张钟颖, 张强. 融合快速注意力机制的节点无特征网络链路预测算法 Link Prediction for Node Featureless Networks Based on Faster Attention Mechanism 计算机科学, 2022, 49(4): 43-48. https://doi.org/10.11896/jsjkx.210800276 |
[2] | 赵志强, 易秀双, 李婕, 王兴伟. 基于GR-AD-KNN算法的IPv6网络DoS入侵检测技术研究 Research on DoS Intrusion Detection Technology of IPv6 Network Based on GR-AD-KNN Algorithm 计算机科学, 2021, 48(6A): 524-528. https://doi.org/10.11896/jsjkx.200500001 |
[3] | 刘嘉琛, 秦小麟, 朱润泽. 基于LSTM-Attention的RFID移动对象位置预测 Prediction of RFID Mobile Object Location Based on LSTM-Attention 计算机科学, 2021, 48(3): 188-195. https://doi.org/10.11896/jsjkx.200600134 |
[4] | 邹承明, 陈德. 高维大数据分析的无监督异常检测方法 Unsupervised Anomaly Detection Method for High-dimensional Big Data Analysis 计算机科学, 2021, 48(2): 121-127. https://doi.org/10.11896/jsjkx.191100141 |
[5] | 邢长征, 朱金侠, 孟祥福, 齐雪月, 朱尧, 张峰, 杨一鸣. 兴趣点推荐方法研究综述 Point-of-interest Recommendation:A Survey 计算机科学, 2021, 48(11A): 176-183. https://doi.org/10.11896/jsjkx.201100021 |
[6] | 杨力, 李欣宇, 石怀峰, 潘成胜. 空间信息网络任务智能识别方法 Task Intelligent Identification Method for Spatial Information Network 计算机科学, 2020, 47(4): 262-269. https://doi.org/10.11896/jsjkx.190300111 |
[7] | 吴丹丹, 吕鑫. 分布式结构下基于用户协作的匿名区域构建算法 Location Anonymous Algorithm Based on User Collaboration under Distributed Structure 计算机科学, 2019, 46(4): 158-163. https://doi.org/10.11896/j.issn.1002-137X.2019.04.025 |
[8] | 贾旭, 孙福明, 李豪杰, 曹玉东. 基于有监督双正则NMF的静脉识别算法 Vein Recognition Algorithm Based on Supervised NMF with Two Regularization Terms 计算机科学, 2018, 45(8): 283-287. https://doi.org/10.11896/j.issn.1002-137X.2018.08.051 |
[9] | 李香元,蔡骋,何进荣. 基于密度缩放因子的ISOMAP算法 Density Scaling Factor Based ISOMAP Algorithm 计算机科学, 2018, 45(7): 207-213. https://doi.org/10.11896/j.issn.1002-137X.2018.07.036 |
[10] | 黄铉. 特征降维技术的研究与进展 Research and Development of Feature Dimensionality Reduction 计算机科学, 2018, 45(6A): 16-21. |
[11] | 张志禹, 刘思媛. 一种基于Curv-SAE特征融合的人脸降维和识别方法 Method of Face Recognition and Dimension Reduction Based on Curv-SAE Feature Fusion 计算机科学, 2018, 45(10): 267-271. https://doi.org/10.11896/j.issn.1002-137X.2018.10.049 |
[12] | 首照宇,杨晓帆. 联合Gabor误差字典和低秩表示的人脸识别算法 Jointing Gabor Error Dictionary and Low Rank Representation for Face Recognition 计算机科学, 2017, 44(3): 296-299. https://doi.org/10.11896/j.issn.1002-137X.2017.03.060 |
[13] | 张建辉,王会青,孙宏伟,郭芷榕,白莹莹. 基于二分迭代SAX的时序相似性度量算法 Similarity Measure Algorithm of Time Series Based on Binary-dividing SAX 计算机科学, 2017, 44(1): 247-252. https://doi.org/10.11896/j.issn.1002-137X.2017.01.046 |
[14] | 梅玲玲,龚劬. 基于改进的自适应局部保持投影算法的人脸识别 Face Recognition Based on Improved Adaptive Locality Preserving Projection 计算机科学, 2016, 43(8): 286-291. https://doi.org/10.11896/j.issn.1002-137X.2016.08.058 |
[15] | 曾安,郑齐弥. 基于MIC的深度置信网络研究 Deep Belief Networks Research Based on Maximum Information Coefficient 计算机科学, 2016, 43(8): 249-253. https://doi.org/10.11896/j.issn.1002-137X.2016.08.050 |
|