Computer Science (计算机科学), 2023, Vol. 50, Issue (2): 138-145. doi: 10.11896/jsjkx.220400230

• Database & Big Data & Data Science •

Hierarchical Multiple Kernel K-Means Algorithm Based on Sparse Connectivity

WANG Lei1,2, DU Liang1,2, ZHOU Peng3

  1 College of Computer and Information Technology, Shanxi University, Taiyuan 030006, China
  2 Institute of Big Data Science and Industry, Shanxi University, Taiyuan 030006, China
  3 College of Computer Science and Technology, Anhui University, Hefei 230601, China
  • Received: 2022-04-24 Revised: 2022-10-27 Online: 2023-02-15 Published: 2023-02-22
  • Corresponding author: DU Liang (duliang@sxu.edu.cn)
  • About the author: (575264909@qq.com)
  • Supported by:
    National Natural Science Foundation of China, General Program (61976129, 62176001) and Natural Science Foundation for Young Scientists of Shanxi Province, China (201901D211168)

Abstract: Multiple kernel learning (MKL) aims to find an optimal consensus kernel function. The hierarchical multiple kernel clustering (HMKC) algorithm extracts sample features layer by layer from a high-dimensional space so as to retain as much effective information as possible, but it ignores the information interaction between layers. In that model, only corresponding nodes in adjacent layers exchange information, so each node is isolated from all other nodes; adopting full connections instead would reduce the diversity of the final consensus matrix. Therefore, this paper proposes a hierarchical multiple kernel K-Means algorithm based on sparse connectivity (Sparse Connectivity Hierarchical Multiple Kernel K-Means, SCHMKKM). The algorithm uses a sparsity rate to control the assignment matrix and thereby achieve sparse connections, so that the features obtained by information distillation between layers are fused locally. Finally, cluster analysis is performed on multiple data sets, and the algorithm is compared experimentally with the fully connected hierarchical multiple kernel K-Means algorithm (FCHMKKM). The results show that fusing more diverse information helps to learn a better consensus partition matrix, and that the sparse-connection fusion strategy outperforms the full-connection strategy.

Key words: Multiple kernel learning, Hierarchical multiple kernel clustering, Sparse connectivity, Fully connected, Information distillation, Local fusion
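The idea of controlling an assignment matrix with a sparsity rate can be illustrated with a small sketch. This is not the SCHMKKM update rule from the paper, whose details are not given in the abstract. It only assumes that each node of the next layer holds non-negative weights over the nodes of the previous layer, that a sparsity rate decides what fraction of those weights is zeroed, and that the surviving connections fuse base kernels by a weighted sum. The function names `sparsify_assignment` and `fuse_kernels`, the Gaussian base kernels, and the example weights are all hypothetical.

```python
import numpy as np


def sparsify_assignment(weights, sparsity_rate):
    """Keep the largest entries in each row, zero the rest, renormalize rows.

    Assumption (not from the paper): rows index next-layer nodes, columns
    index previous-layer nodes, and weights are non-negative.
    """
    weights = np.asarray(weights, dtype=float)
    n_cols = weights.shape[1]
    # Number of connections kept per row; always keep at least one.
    keep = max(1, int(round((1.0 - sparsity_rate) * n_cols)))
    # Column indices of the `keep` largest weights in every row.
    top = np.argsort(-weights, axis=1)[:, :keep]
    mask = np.zeros(weights.shape, dtype=bool)
    np.put_along_axis(mask, top, True, axis=1)
    sparse = np.where(mask, weights, 0.0)
    row_sums = sparse.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0.0] = 1.0  # guard against all-zero rows
    return sparse / row_sums


def fuse_kernels(kernels, assignment):
    """Local fusion: each output kernel is a weighted sum of connected inputs.

    kernels has shape (m, n, n); assignment has shape (p, m).
    The result has shape (p, n, n).
    """
    return np.tensordot(assignment, kernels, axes=([1], [0]))


# Toy data: four Gaussian base kernels on six random samples.
rng = np.random.default_rng(0)
features = rng.normal(size=(6, 3))
sq_dist = ((features[:, None, :] - features[None, :, :]) ** 2).sum(axis=-1)
base_kernels = np.stack([np.exp(-g * sq_dist) for g in (0.1, 0.5, 1.0, 2.0)])

dense_weights = np.array([[0.4, 0.3, 0.2, 0.1],
                          [0.1, 0.1, 0.2, 0.6]])
sparse_weights = sparsify_assignment(dense_weights, sparsity_rate=0.5)
fused = fuse_kernels(base_kernels, sparse_weights)
print(sparse_weights)
print(fused.shape)
```

With `sparsity_rate=0.0` every connection is kept, which corresponds to the fully connected case that the abstract compares against. Larger rates leave each fused kernel depending on fewer inputs, so the fused kernels differ more from one another.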

CLC Number:

  • TP181