Computer Science (计算机科学) ›› 2011, Vol. 38 ›› Issue (Z10): 136-139.

• CRSSC-CWI-CGrC2015 •

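Use of LDA Model in Topic Tracking

ZHANG Xiao-yan, WANG Ting, LIANG Xiao-bo

  1. (College of Humanities and Social Sciences, National University of Defense Technology, Changsha 410074) (College of Computer, National University of Defense Technology, Changsha 410073)
  • Online: 2018-11-16 Published: 2018-11-16
  • Funding:
    This work was supported by the National Natural Science Foundation of China (60873097, 60933005).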

Abstract: As research on the LDA model deepens, its ability to represent and mine text continues to improve. The "topic" is a central concept in the LDA model: it is a multinomial probability distribution over the feature set. Topic tracking monitors a stream of unseen news stories to find all stories related to a topic that is identified by a few known on-topic samples. The LDA model is applied to topic tracking for two purposes: (1) to test how well LDA topics represent the tracked topic; and (2) to examine the relation between LDA topics and the tracked topic when the LDA model mines the tracked topic from the training data. The experimental results indicate that tracking based on the LDA model performs better than the classical vector space model, the unigram language model, and the event model designed specifically for tracked topics. However, because the two kinds of topics differ in granularity, LDA topics and tracked topics do not correspond one-to-one. An LDA model with customizable topics is the goal of future work.

Key words: LDA model, Topic tracking, Topic
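
The tracking task described in the abstract can be illustrated with a minimal sketch. It assumes that each story has already been mapped to a mixture over LDA topics by some LDA implementation, which is not shown here. The cosine similarity, the centroid profile, the threshold value, and all names (`centroid`, `cosine`, `track`) are illustrative assumptions, not the method reported in the paper.

```python
import math

def centroid(vectors):
    """Average a list of equal-length topic-mixture vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def track(on_topic_samples, stream, threshold=0.8):
    """Return indices of stream stories judged on-topic.

    The tracked topic is profiled as the centroid of the few known
    on-topic samples; the threshold is a hypothetical value.
    """
    profile = centroid(on_topic_samples)
    return [i for i, story in enumerate(stream)
            if cosine(profile, story) >= threshold]

# Hypothetical mixtures over 3 LDA topics; each row sums to 1.
samples = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1]]
stream = [[0.65, 0.25, 0.10], [0.05, 0.15, 0.80], [0.5, 0.4, 0.1]]
tracked = track(samples, stream)
```

Because a tracked topic may spread over several LDA topics, the profile here is a mixture rather than a single LDA topic, which is one simple way to accommodate the granularity mismatch noted in the abstract.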
