Computer Science ›› 2021, Vol. 48 ›› Issue (6): 96-102. doi: 10.11896/jsjkx.200700195

• Database & Big Data & Data Science •

Cauchy Non-negative Matrix Factorization for Data Representation

DUAN Fei1,2, WANG Hui-min1, ZHANG Chao1,2   

  1. School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China
    2. Key Laboratory of Computational Intelligence and Chinese Information Processing (Shanxi University), Ministry of Education, Taiyuan 030006, China
  • Received: 2020-07-30  Revised: 2020-10-05  Online: 2021-06-15  Published: 2021-06-03
  • About author: DUAN Fei, born in 1979, Ph.D, lecturer, is a member of China Computer Federation. His main research interests include computer vision and machine learning.
  • Supported by:
    National Natural Science Foundation of China(61806116,61976128,61972238,61802237),Key R&D program of Shanxi Province(International Cooperation,201903D421041),Natural Science Foundation of Shanxi Province(201801D221175,201901D111035,201901D211176),Cultivate Scientific Research Excellence Programs of Higher Education Institutions in Shanxi(CSREP)(2019SK036),Training Program for Young Scientific Researchers of Higher Education Institutions in Shanxi,Scientific and Technological Innovation Programs of Higher Education Institutions in Shanxi(STIP)(2019L0066),Industry-University-Research Collaboration Program Between Shanxi University and Xiaodian District and the 1331 Engineering Project of Shanxi Province,China.

Abstract: As an important matrix factorization model, non-negative matrix factorization (NMF) is widely used in data mining and machine learning. It is often used to extract low-dimensional, sparse, and meaningful features from a collection of non-negative data vectors. Although effective in some scenarios, standard NMF quantifies the reconstruction residual with the squared Frobenius norm, which makes it very sensitive to non-Gaussian noise and outliers. Because real-world data inevitably contain various kinds of noise, a robust version of NMF that is insensitive to non-Gaussian noise and outliers is desirable. This paper proposes to measure the quality of approximation for each sample with the Cauchy function instead of the squared Euclidean distance, and to take the dependencies between different feature dimensions into account. Based on the theory of half-quadratic programming, this paper derives multiplicative update rules that solve the proposed model efficiently. To verify the effectiveness of the proposed approach, extensive unsupervised clustering experiments are conducted on several benchmark face image datasets. The experimental results show that the proposed model is robust to variations in head pose, lighting, and facial expression. Furthermore, the model achieves consistently good performance on all three benchmark datasets when the parameter c varies over a large range.
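To make the approach described above concrete, the following is a minimal, illustrative Python sketch of NMF under a Cauchy loss solved by half-quadratic reweighting: each outer iteration fixes a robustness weight per sample from the current residual and then applies weighted multiplicative updates. This is only a sketch under simplifying assumptions, not the paper's exact algorithm; in particular it uses a single weight per sample and ignores the dependencies between feature dimensions that the proposed model also captures, and the function name cauchy_nmf, the scale parameter c, and all other implementation details are hypothetical.

```python
import numpy as np

def cauchy_nmf(X, rank, c=1.0, n_iter=200, eps=1e-10, seed=0):
    """Illustrative sketch: NMF with a Cauchy loss via half-quadratic reweighting.

    X    : (d, n) non-negative data matrix, columns are samples.
    rank : number of basis vectors.
    c    : Cauchy scale parameter (assumed here; controls outlier tolerance).
    Returns non-negative factors W (d, rank) and H (rank, n).
    """
    rng = np.random.default_rng(seed)
    d, n = X.shape
    W = rng.random((d, rank)) + eps
    H = rng.random((rank, n)) + eps

    for _ in range(n_iter):
        # Half-quadratic step: per-sample weights from the current residuals.
        # A Cauchy loss of the form log(1 + e^2/c^2) induces the weight 1/(1 + e^2/c^2).
        resid = X - W @ H
        err2 = np.sum(resid ** 2, axis=0)      # squared residual per sample
        w = 1.0 / (1.0 + err2 / (c ** 2))      # shape (n,)

        Xw = X * w                             # scale each column by its weight
        # Weighted multiplicative updates for the surrogate objective
        # sum_i w_i * ||x_i - W h_i||^2; they keep W and H non-negative.
        H *= (W.T @ Xw) / (W.T @ W @ H * w + eps)
        W *= (Xw @ H.T) / ((W @ H * w) @ H.T + eps)

    return W, H
```

Intuitively, samples with large residuals receive weights close to zero, so gross outliers contribute little to the weighted updates; this is the mechanism behind the robustness to non-Gaussian noise claimed in the abstract.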

Key words: Cauchy loss, Clustering, Data representation, Nonnegative matrix factorization

CLC Number: TP391