Computer Science ›› 2012, Vol. 39 ›› Issue (9): 175-179.


Study of Automatic Keywords Labeling for Scientific Literature

  

  • Online:2018-11-16 Published:2018-11-16

Abstract: Author-provided keywords help readers of scientific literature, but some scientific papers carry no keywords, for a variety of reasons. This paper proposes a new abstract-based automatic keyword prediction algorithm for scientific literature without keywords. The abstracts of scientific papers that had been given keywords by their authors were used as the training data set. Four text modeling methods, namely the language model (LM), latent Dirichlet allocation (LDA), the probabilistic author-topic model, and a combination of LM and LDA, were employed to model the abstracts and keywords in the training set and so build the relations between keywords and the terms of the abstracts. The trained models were then used to predict keywords for the abstracts of scientific papers without keywords. Experimental results on both Chinese and English data sets show that the keywords predicted by the proposed algorithms reflect the content of the scientific literature well. Among all the models, the combination of LM and LDA performs best.

Key words: Language model, Tag prediction, Latent Dirichlet allocation, Probabilistic author-topic model
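
The language-model variant described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes whitespace-tokenized abstracts, a unigram model per keyword, and linear interpolation with an add-one smoothed background model. The function names `train_keyword_lm` and `rank_keywords` and the weight `lam` are hypothetical choices made for illustration only.

```python
import math
from collections import Counter, defaultdict


def train_keyword_lm(pairs):
    """Collect term counts per keyword from (abstract_tokens, keywords) pairs.

    Hypothetical helper: each keyword gets the pooled term counts of every
    abstract it was assigned to; a background count covers all abstracts.
    """
    term_counts = defaultdict(Counter)  # keyword -> term frequencies
    background = Counter()              # term frequencies over all abstracts
    for tokens, keywords in pairs:
        background.update(tokens)
        for kw in keywords:
            term_counts[kw].update(tokens)
    return term_counts, background


def rank_keywords(tokens, term_counts, background, lam=0.5, top_k=3):
    """Rank keywords by the smoothed log-likelihood of the abstract's terms.

    Assumption: the keyword model is interpolated with the background model
    using weight lam, so unseen terms never yield a zero probability.
    """
    total_bg = sum(background.values())
    vocab = len(background)
    scores = {}
    for kw, counts in term_counts.items():
        total_kw = sum(counts.values())
        score = 0.0
        for t in tokens:
            p_kw = counts[t] / total_kw
            # Add-one smoothing keeps the background probability positive.
            p_bg = (background[t] + 1) / (total_bg + vocab + 1)
            score += math.log(lam * p_kw + (1 - lam) * p_bg)
        scores[kw] = score
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy training set: abstracts that already carry author keywords.
training_pairs = [
    ("topic model infers latent topics from documents".split(), ["topic model"]),
    ("language model assigns probability to word sequences".split(), ["language model"]),
]
term_counts, background = train_keyword_lm(training_pairs)
predicted = rank_keywords("latent topics in documents".split(),
                          term_counts, background, top_k=1)
```

In this toy setting the unlabeled abstract shares terms only with the abstract labeled "topic model", so that keyword is ranked first. A combination with LDA, which the abstract reports as the best model, would presumably add a topic-level term probability to the same scoring step, but how the paper combines the two is not stated here.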
