Computer Science ›› 2015, Vol. 42 ›› Issue (5): 119-123. doi: 10.11896/j.issn.1002-137X.2015.05.024


Project Topic Model Construction Based on Semi-supervised Graph Clustering

SHI Lin-bin, YU Zheng-tao, YAN Xin, SONG Hai-xia and HONG Xu-dong   

  • Online: 2018-11-14 Published: 2018-11-14

Abstract: The quality of a project topic model directly affects the subsequent recommendation of evaluation experts. To exploit the association relationships among project document fragments when analyzing project topics, we propose a project topic model construction method based on semi-supervised graph clustering. We first analyze the structural characteristics of project documents to extract the project name, project keywords, and other structural information that reflects project topics. Combining these with external resources that indicate expert topics, such as expert evidence documents and expert topic relationship networks, we define and extract association relationship features among project document fragments. We then use the different association relationships to calculate the correlation among project document fragments and build an undirected graph model over the fragments. Finally, using the labeled association relationship features as supervision, we apply a semi-supervised graph clustering algorithm to cluster the project document fragments and thereby construct the project topic model. Comparative experiments on project topic extraction verify the effectiveness of the proposed method. The structural features of project documents, expert evidence documents, and expert topic relationship networks provide a degree of guidance for constructing the project topic model.

Key words: Topic model, Semi-supervised graph clustering, Association relationship features, Evaluation experts recommendation
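
The pipeline summarized in the abstract (score pairwise associations between fragments, build an undirected weighted graph, then cluster under pairwise supervision) can be illustrated with a small sketch. It is not the authors' algorithm. Jaccard word overlap stands in for the paper's association relationship features, and constrained average-link agglomerative clustering stands in for its semi-supervised graph clustering step. The names `jaccard`, `build_graph`, `cluster`, and the `bonus` parameter are hypothetical and introduced only for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Word-overlap similarity; an assumed stand-in for the paper's correlation measures."""
    a, b = set(a.split()), set(b.split())
    return len(a & b) / len(a | b) if a | b else 0.0


def build_graph(fragments, must_link=(), cannot_link=(), bonus=1.0):
    """Return a symmetric weight matrix describing an undirected fragment graph.

    must_link / cannot_link are pairs of fragment indices that act as the
    supervision: their edge weights are raised or lowered by `bonus`
    (a hypothetical parameter, not taken from the paper).
    """
    n = len(fragments)
    w = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        w[i][j] = w[j][i] = jaccard(fragments[i], fragments[j])
    for i, j in must_link:
        w[i][j] = w[j][i] = w[i][j] + bonus
    for i, j in cannot_link:
        w[i][j] = w[j][i] = w[i][j] - bonus
    return w


def cluster(w, k):
    """Greedy average-link agglomerative clustering down to k clusters."""
    clusters = [[i] for i in range(len(w))]

    def avg_link(a, b):
        # Mean edge weight between two groups of nodes.
        return sum(w[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        # Merge the pair of clusters with the highest average edge weight.
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: avg_link(clusters[p[0]], clusters[p[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe: b > a, so index a is unaffected
    return [sorted(c) for c in clusters]


fragments = [
    "topic model text clustering",
    "graph clustering text topic",
    "protein folding simulation",
    "molecular simulation of protein",
    "semi supervised learning for documents",
]
# Fragment 4 shares no words with the others, so only the supervision places it.
graph = build_graph(fragments, must_link=[(0, 4)], cannot_link=[(2, 4)])
print(cluster(graph, 2))
```

In this toy input the last fragment has zero word overlap with every other fragment, so the must-link pair alone pulls it into the first topic, while the cannot-link pair keeps it away from the protein-related fragments.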

