Computer Science (计算机科学), 2014, Vol. 41, Issue 10: 91-94. doi: 10.11896/j.issn.1002-137X.2014.10.021
LIU Chao, ZHUANG Lian-sheng, YU Neng-hai (刘超, 庄连生, 俞能海)
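Abstract: The topic-space projection matrix learned by traditional latent semantic analysis (LSA) is usually dense, which makes it costly to store and leaves the meaning of each topic unclear. To address this problem, we propose a new sparse topic model. The model imposes a sparsity constraint on the projection matrix so that each topic is associated with only a few terms, which improves topic interpretability. It also imposes a low-rank constraint on the coding coefficient matrix so that the data exhibit better clustering structure in the topic space. Experimental results show that the topic space obtained with this model is better suited to classification and that the projection matrix is cheaper to store.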
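The link between sparsity, interpretability, and storage cost can be illustrated with a small sketch. The abstract does not say how the sparsity constraint is enforced, so the code below assumes a common choice, element-wise soft-thresholding (the proximal operator of an L1 penalty), and is not the authors' algorithm. The function names, the threshold `tau`, and the 3-topic by 5-term matrix are all hypothetical and chosen only for illustration. The low-rank constraint on the coefficient matrix is not covered here.

```python
def soft_threshold(matrix, tau):
    """Element-wise soft-thresholding: shrink each entry toward zero by tau.

    Entries with magnitude at most tau become exactly zero, which is what
    makes the result sparse. (Assumed mechanism, not taken from the paper.)
    """
    result = []
    for row in matrix:
        new_row = []
        for value in row:
            if value > tau:
                new_row.append(value - tau)
            elif value < -tau:
                new_row.append(value + tau)
            else:
                new_row.append(0.0)
        result.append(new_row)
    return result


def to_coordinate_list(matrix):
    """Store only the nonzero entries as (topic, term, weight) triples."""
    return [
        (i, j, value)
        for i, row in enumerate(matrix)
        for j, value in enumerate(row)
        if value != 0.0
    ]


# Hypothetical dense projection matrix: 3 topics (rows) x 5 terms (columns).
dense_projection = [
    [0.90, 0.05, -0.02, 0.01, 0.70],
    [0.03, -0.80, 0.04, 0.60, 0.02],
    [-0.01, 0.02, 0.95, 0.03, -0.04],
]

sparse_projection = soft_threshold(dense_projection, tau=0.1)
entries = to_coordinate_list(sparse_projection)

dense_count = sum(len(row) for row in dense_projection)  # every entry stored
stored_count = len(entries)                              # only nonzeros stored

print(f"dense entries: {dense_count}, stored after thresholding: {stored_count}")
for topic, term, weight in entries:
    print(f"topic {topic} -> term {term} (weight {weight:.2f})")
```

In this toy matrix each topic keeps only one or two terms after thresholding, so a topic can be read off from a short term list, and the matrix can be stored as a handful of triples instead of a full dense array.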