Computer Science (计算机科学), 2020, Vol. 47, Issue (10): 108-113. doi: 10.11896/jsjkx.190700112

• Database & Big Data & Data Science •

基于余弦相似度的稀疏非负矩阵分解算法

周昌1,2, 李向利1,3, 李俏霖1, 朱丹丹1, 陈世莲1, 蒋丽榕1   

    1 桂林电子科技大学数学与计算科学学院 广西 桂林541004
    2 桂林电子科技大学广西高校数据分析与计算重点实验室 广西 桂林541004
    3 桂林电子科技大学广西密码学与信息安全重点实验室 广西 桂林541004
  • Corresponding author: LI Xiang-li (lixiangli@guet.edu.cn)
  • About author: 2218080688@qq.com

Sparse Non-negative Matrix Factorization Algorithm Based on Cosine Similarity

ZHOU Chang1,2, LI Xiang-li1,3, LI Qiao-lin1, ZHU Dan-dan1, CHEN Shi-lian1, JIANG Li-rong1   

    1 School of Mathematics and Computing Science, Guilin University of Electronic Technology, Guilin, Guangxi 541004, China
    2 Guangxi Colleges and Universities Key Laboratory of Data Analysis and Computation, Guilin University of Electronic Technology, Guilin, Guangxi 541004, China
    3 Guangxi Key Laboratory of Cryptography and Information Security, Guilin University of Electronic Technology, Guilin, Guangxi 541004, China
  • Received: 2019-07-17 Revised: 2019-10-18 Online: 2020-10-15 Published: 2020-10-16
  • About author: ZHOU Chang, born in 1998, postgraduate. His main research interests include image clustering.
    LI Xiang-li, born in 1977, Ph.D., professor. Her main research interests include image clustering, non-negative matrix factorization, and optimization.
  • Supported by:
    National Natural Science Foundation of China (11961010, 71561008), Guangxi Natural Science Foundation (2018GXNSFAA138169), Guangxi Key Laboratory of Cryptography and Information Security (GCIS201708), Guangxi Key Laboratory of Automatic Testing Technology and Instruments (YQ19111, YQ18112), and Guangxi University Students Innovation and Entrepreneurship Project (201810595218)

Abstract: When basic non-negative matrix factorization (NMF) is applied to image clustering, it is not robust to outliers and the resulting factors have poor sparsity. To improve the sparsity of the factor matrices, the L2,1 norm is introduced into the basic NMF model, which yields sparse factors and improves the performance of the algorithm. In addition, to reduce the correlation between features and strengthen their independence in the NMF model, cosine similarity is introduced, and a sparse non-negative matrix factorization algorithm based on cosine similarity is proposed. The algorithm has significant advantages in processing high-dimensional data and extracting features, and it improves discrimination accuracy in image clustering. Experimental results show that the proposed algorithm outperforms the traditional non-negative matrix factorization algorithm on a series of evaluation metrics.

Key words: Cosine similarity, Image clustering, L2,1 norm, Non-negative matrix factorization
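
The two quantities named in the abstract can be illustrated with a minimal sketch. This is not the authors' objective function or update rule, which the abstract does not give. The function names `l21_norm` and `mean_pairwise_cosine`, the row-wise convention for the L2,1 norm, and the choice to average the cosine similarity over pairs of basis columns are all assumptions made here for illustration.

```python
import numpy as np


def l21_norm(M):
    """L2,1 norm under an assumed row-wise convention:
    the sum of the Euclidean norms of the rows of M."""
    return float(np.sum(np.linalg.norm(M, axis=1)))


def mean_pairwise_cosine(W, eps=1e-12):
    """Average cosine similarity over all distinct pairs of columns of W.

    Hypothetical helper: a lower value means the columns (features)
    point in more different directions, i.e. are less correlated.
    """
    k = W.shape[1]
    if k < 2:
        return 0.0  # no pairs to compare
    norms = np.linalg.norm(W, axis=0)
    Wn = W / np.maximum(norms, eps)   # unit-length columns
    G = Wn.T @ Wn                     # G[i, j] = cosine of columns i and j
    upper = np.triu_indices(k, k=1)   # distinct pairs only
    return float(G[upper].mean())


# Small usage example with hand-checkable inputs.
row_sparse = np.array([[3.0, 4.0], [0.0, 0.0]])
spread_out = np.array([[3.0, 0.0], [0.0, 4.0]])
print(l21_norm(row_sparse))   # one non-zero row
print(l21_norm(spread_out))   # same entries spread over two rows

orthogonal = np.eye(2)
parallel = np.array([[1.0, 2.0], [1.0, 2.0]])
print(mean_pairwise_cosine(orthogonal))
print(mean_pairwise_cosine(parallel))
```

With the same non-zero entries, the matrix that concentrates them in a single row has the smaller L2,1 norm, which is why penalizing this norm tends to favor row-sparse solutions. The cosine measure is 0 for orthogonal columns and approaches 1 for parallel ones, so penalizing it discourages redundant features.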

CLC Number: 

  • O235