Computer Science ›› 2025, Vol. 52 ›› Issue (6A): 240900131-10. doi: 10.11896/jsjkx.240900131

• Big Data & Data Science •

  • Corresponding author: CHEN Pan (victory_cp@163.com)
  • Author profile: (duyuanhuaa@126.com)

Correntropy Based Multi-view Low-rank Matrix Factorization and Constraint Graph Learning for Multi-view Data Clustering

DU Yuanhua1, CHEN Pan1, ZHOU Nan2, SHI Kaibo2, CHEN Eryang2, ZHANG Yuanpeng3,4   

  1 College of Applied Mathematics, Chengdu University of Information Technology, Chengdu 610225, China
  2 School of Electronic Information and Electrical Engineering, Chengdu University, Chengdu 610106, China
  3 Department of Medical Informatics, Nantong University, Nantong, Jiangsu 226019, China
  4 Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong 999077, China
  • Online: 2025-06-16  Published: 2025-06-12
  • About author: CHEN Pan, born in 1999, postgraduate. Her main research interests include machine learning, spectral clustering and low-rank modeling.
  • Supported by:
    National Natural Science Foundation of China (12101090), 2023 Chengdu University of Information Technology Science and Technology Innovation Capability Enhancement Plan Innovation Team Key Project (KYTD202322), Research Projects in Philosophy and Social Sciences in Jiangsu Province (2023SJYB1680) and Sichuan Provincial Science and Technology Department (2023NSFSC1425, 2023NSFSC0071, 2023NSESC1362).

Abstract: Most current multi-view clustering methods target unsupervised learning scenarios and therefore cannot exploit the label information in the data. They also cannot handle outliers that may be present in the data. To address these issues, this paper proposes a semi-supervised clustering method for multi-view data built on correntropy based multi-view low-rank matrix factorization (CMLMF). Specifically, a constraint matrix is used to introduce label information, and the influence of outliers in the affinity matrix and the labels is removed by maximizing the correntropy criterion. To make full use of local structure information, a correntropy based multi-view constrained graph learning framework is also proposed to adaptively extract the local structure hidden in multi-view data. In addition, a correntropy based multi-view low-rank matrix factorization (CMLMF) model is proposed and combined with the adaptive graph learning framework to extract the global reconstruction information of the data. Finally, an effective optimization algorithm combining the Fenchel conjugate (FC) and block coordinate update (BCU) is designed to solve the model. Experimental results show that, compared with existing methods, CMLMF achieves substantially higher accuracy (ACC), normalized mutual information (NMI), and precision, which verifies the effectiveness of the algorithm.

Key words: Low-rank matrix factorization, Semi-supervised learning(SSL), Multi-view clustering, Maximum correntropy criterion(MCC)
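
To illustrate why a correntropy criterion is less sensitive to outliers than a squared-error criterion, the following is a minimal sketch and not the authors' implementation. It assumes a Gaussian kernel with a hypothetical bandwidth `sigma` and omits the kernel's normalizing constant; the function names `correntropy` and `hq_weights` are illustrative only. The weight formula is the form commonly obtained in half-quadratic treatments of correntropy objectives; whether the paper's Fenchel conjugate step yields exactly these weights is an assumption.

```python
import numpy as np

def correntropy(a, b, sigma=1.0):
    """Empirical correntropy: mean Gaussian-kernel similarity of paired entries.

    The kernel's normalizing constant is omitted, so identical inputs give 1.0.
    """
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.mean(np.exp(-diff**2 / (2.0 * sigma**2))))

def hq_weights(residual, sigma=1.0):
    """Per-entry weights of the kind used in half-quadratic reweighting.

    Entries with large residuals receive weights close to zero.
    """
    r = np.asarray(residual, dtype=float)
    return np.exp(-r**2 / (2.0 * sigma**2))

x = np.array([1.0, 2.0, 3.0, 4.0])
clean = x.copy()
noisy = x.copy()
noisy[3] = 100.0  # one gross outlier (toy data, not from the paper)

c_clean = correntropy(x, clean)               # all entries match
c_noisy = correntropy(x, noisy)               # outlier only removes its own share
mse_noisy = float(np.mean((x - noisy) ** 2))  # outlier dominates the squared error
w = hq_weights(x - noisy)                     # outlier entry is down-weighted

print(c_clean, c_noisy, mse_noisy, w)
```

In this toy case the single outlier can lower the correntropy by at most its own share of the mean, while it dominates the mean squared error. The weights show how a reweighted update would effectively ignore the corrupted entry.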

CLC Number: TP181