Computer Science ›› 2017, Vol. 44 ›› Issue (5): 226-231. doi: 10.11896/j.issn.1002-137X.2017.05.040

Building Hierarchical Topic Based on Heterogeneous Chinese Online Encyclopedia

WANG Xu-zhong, LIU Yan, HU Lin-mei and CHEN Jing   

  • Online: 2018-11-13   Published: 2018-11-13

Abstract: Chinese online encyclopedias carry a large amount of high-quality information, and previous studies have used them for a variety of knowledge acquisition tasks. For instance, articles on similar subjects are grouped into categories. Constructing a topic hierarchy for a given category of an online encyclopedia is beneficial for many applications, such as search and browsing, information organization, and information retrieval. However, no attempts have been made to explore the topic hierarchy of a given category in an online encyclopedia. Because most online encyclopedia content is heterogeneous and coarse, this paper proposes a novel scheme for constructing topic hierarchies based on a Bayesian network. The scheme incorporates both the structured tables of contents and the unstructured text descriptions of the articles in the same category, and learns the category's topic hierarchy automatically by applying a maximum spanning tree algorithm to the Bayesian topic network. Experimental results show that, compared with the existing encyclopedia topic hierarchy, the approach expands the content fourfold while maintaining an accuracy of 75%.
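
The abstract names a maximum spanning tree over a weighted topic network as the step that yields the hierarchy. The sketch below illustrates only that step, under loud assumptions: the topic names, the edge weights, and the function name `max_spanning_hierarchy` are hypothetical, the graph is treated as undirected, and Prim's algorithm stands in for whatever procedure the paper actually uses (for a directed network, a Chu-Liu/Edmonds arborescence would be the usual choice). How the paper derives edge weights from the Bayesian network is not given in the abstract and is not modeled here.

```python
import heapq
from collections import defaultdict


def max_spanning_hierarchy(edges, root):
    """Return {topic: parent} for a maximum spanning tree grown from `root`.

    `edges` is a list of (topic_a, topic_b, weight) tuples; the weights are
    hypothetical relatedness scores, not values taken from the paper.
    """
    # Build an undirected adjacency list.
    adjacency = defaultdict(list)
    for a, b, weight in edges:
        adjacency[a].append((b, weight))
        adjacency[b].append((a, weight))

    parent = {root: None}
    # heapq is a min-heap, so weights are negated to pop the heaviest edge first.
    heap = [(-weight, root, neighbor) for neighbor, weight in adjacency[root]]
    heapq.heapify(heap)

    while heap:
        _, source, target = heapq.heappop(heap)
        if target in parent:
            continue  # target is already attached to the tree
        parent[target] = source
        for neighbor, weight in adjacency[target]:
            if neighbor not in parent:
                heapq.heappush(heap, (-weight, target, neighbor))
    return parent


# Hypothetical topic network for one encyclopedia category.
edges = [
    ("machine learning", "supervised learning", 0.9),
    ("machine learning", "neural network", 0.6),
    ("supervised learning", "neural network", 0.8),
    ("machine learning", "clustering", 0.7),
    ("supervised learning", "clustering", 0.2),
]
hierarchy = max_spanning_hierarchy(edges, "machine learning")
for topic, parent_topic in hierarchy.items():
    print(topic, "<-", parent_topic)
```

In this toy network, "neural network" is attached under "supervised learning" rather than directly under the root, because that edge carries the larger weight. Topics that are not connected to the root are simply left out of the returned mapping.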

Key words: Chinese online encyclopedia, Topic hierarchy, Structured contents table, Unstructured text description
