Computer Science ›› 2014, Vol. 41 ›› Issue (3): 223-227.


Labeled-LDA Text Classification Algorithm Based on Graph Model for “Central Topic Oblivion Problem”

LI Wei, MA Yong-zheng and SHEN Yi

  • Online: 2018-11-14 Published: 2018-11-14

Abstract: Latent Dirichlet Allocation (LDA) is an unsupervised topic model used to mine latent topic information from a corpus. Labeled-LDA, a variant of LDA, can perform multi-class classification on labeled documents; it establishes a one-to-one mapping between topics and labels and learns the relationship between words and labels. Recently, graph models have achieved good results in text mining, providing a new way to analyze the semantics of documents. This paper proposes a new text classification method that combines complex network theory with Labeled-LDA. The experimental results show that the new method improves Macro_F1 compared with the traditional LDA model.

Key words: Text classification, Graph mining, Graph model, LDA

