Computer Science, 2019, Vol. 46, Issue (6): 64-68. doi: 10.11896/j.issn.1002-137X.2019.06.008
黄梦婷, 张灵, 姜文超
HUANG Meng-ting, ZHANG Ling, JIANG Wen-chao
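Abstract: As big-data applications develop, multi-type relational data sampled from nonlinear manifolds keeps growing in scale, its geometric structure becomes more complex, and heterogeneous relational data becomes extremely sparse, which makes data mining harder and less accurate. To address these problems, this paper proposes a co-clustering method for multi-type relational data based on manifold nonnegative matrix tri-factorization. First, for the smaller-scale entity types, an association matrix is constructed from their natural relationships or content relevance; factorizing it yields the cluster indicator matrix of that entity type, which serves as an input to the nonnegative matrix tri-factorization. Then, manifold regularization is added on top of fast nonnegative matrix tri-factorization (FNMTF), so that inter-type and intra-type relationships are clustered jointly, further improving clustering accuracy. Experiments show that the manifold nonnegative matrix tri-factorization algorithm outperforms traditional co-clustering algorithms based on nonnegative matrix factorization in both accuracy and overall performance.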
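To make the idea of adding manifold regularization to a tri-factorization concrete, the following is a minimal sketch and not the authors' algorithm. It assumes the common graph-regularized objective ||X - F S G^T||_F^2 + lam * tr(F^T L_F F) + mu * tr(G^T L_G G) with F, S, G >= 0, where L_F and L_G are graph Laplacians of k-nearest-neighbour graphs over the rows and columns of X. It uses soft multiplicative updates in place of the binary cluster-indicator assignment of FNMTF, and it omits the abstract's first step (pre-factorizing an association matrix for the smaller entity types). The names `knn_graph`, `objective`, `manifold_nmtf` and the parameters `lam`, `mu`, `k` are hypothetical, chosen only for illustration.

```python
import numpy as np


def knn_graph(A, k=3):
    """Symmetric 0/1 k-nearest-neighbour affinity matrix over the rows of A."""
    sq = np.sum(A * A, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * A @ A.T   # squared Euclidean distances
    np.fill_diagonal(dist, np.inf)                     # a row is not its own neighbour
    idx = np.argsort(dist, axis=1)[:, :k]              # k closest rows for each row
    W = np.zeros_like(dist)
    W[np.arange(A.shape[0])[:, None], idx] = 1.0
    return np.maximum(W, W.T)                          # symmetrize


def objective(X, F, S, G, W_f, W_g, lam, mu):
    """Reconstruction error plus the two graph (manifold) penalties."""
    L_f = np.diag(W_f.sum(axis=1)) - W_f               # Laplacian of the row graph
    L_g = np.diag(W_g.sum(axis=1)) - W_g               # Laplacian of the column graph
    fit = np.linalg.norm(X - F @ S @ G.T) ** 2
    return fit + lam * np.trace(F.T @ L_f @ F) + mu * np.trace(G.T @ L_g @ G)


def manifold_nmtf(X, n_row_clusters, n_col_clusters, lam=0.1, mu=0.1,
                  k=3, n_iter=200, seed=0, eps=1e-9):
    """Graph-regularized NMTF via multiplicative updates (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W_f, W_g = knn_graph(X, k), knn_graph(X.T, k)
    D_f, D_g = np.diag(W_f.sum(axis=1)), np.diag(W_g.sum(axis=1))

    # Strictly positive random initialization keeps every factor nonnegative.
    F = rng.random((n, n_row_clusters)) + 0.1
    S = rng.random((n_row_clusters, n_col_clusters)) + 0.1
    G = rng.random((m, n_col_clusters)) + 0.1

    history = [objective(X, F, S, G, W_f, W_g, lam, mu)]
    for _ in range(n_iter):
        # Each rule multiplies by (negative gradient part) / (positive gradient part).
        F *= (X @ G @ S.T + lam * W_f @ F) / (F @ (S @ G.T @ G @ S.T) + lam * D_f @ F + eps)
        G *= (X.T @ F @ S + mu * W_g @ G) / (G @ (S.T @ F.T @ F @ S) + mu * D_g @ G + eps)
        S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)
        history.append(objective(X, F, S, G, W_f, W_g, lam, mu))
    return F, S, G, history
```

Under these assumptions, calling `manifold_nmtf(X, 2, 2)` on a nonnegative relation matrix `X` returns soft membership matrices `F` (rows) and `G` (columns); a hard co-clustering can then be read off with `F.argmax(axis=1)` and `G.argmax(axis=1)`. Setting `lam` and `mu` to zero reduces the sketch to plain nonnegative matrix tri-factorization, which gives a simple baseline for judging what the graph terms contribute.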