Computer Science ›› 2015, Vol. 42 ›› Issue (2): 233-236. doi: 10.11896/j.issn.1002-137X.2015.02.048

• Artificial Intelligence •

Temporal Weight in Dynamic Topic Tracking

WU Shu-fang and XU Jian-min

  1. College of Management, Hebei University, Baoding 071002; College of Mathematics and Computer Science, Hebei University, Baoding 071002
  • Online: 2018-11-14 Published: 2018-11-14
  • Supported by:
    This work was supported by the China Postdoctoral Science Foundation

Abstract: Building on the Bayesian belief network, this paper proposes a new dynamic topic tracking model, which is used as the representation model in this paper. Temporal information in dynamic topic tracking is quantified by time distance and then used to adjust feature weights dynamically. Because the weight of a feature that has not reappeared for a long time should decay, a weight decay function is given; if a feature's weight falls below a given threshold after decaying, the feature is treated as redundant information. Experiments were evaluated on the TDT4 test collection using DET curves. Repeated experiments on the TDT corpus yielded the optimal time distance threshold α and the threshold β that decides whether a feature is redundant. The results show that using temporal weight effectively improves the tracking performance of the dynamic topic tracking model.

Key words: Topic tracking, Temporal weight, Decay, Bayesian belief network
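
The decay-and-prune step described in the abstract could be sketched roughly as follows. The abstract does not give the form of the weight decay function, so the exponential form, the `rate` parameter, the function name `decay_weights`, and the dictionary-based data layout are all assumptions made purely for illustration; only the roles of the time distance threshold α and the redundancy threshold β come from the abstract.

```python
import math


def decay_weights(weights, last_seen, now, alpha, beta, rate=0.1):
    """Hypothetical sketch: decay stale feature weights and prune redundant ones.

    weights   -- dict mapping feature -> current weight
    last_seen -- dict mapping feature -> time the feature last appeared
    now       -- current time, in the same units as last_seen
    alpha     -- time distance threshold; decay applies only beyond it
    beta      -- redundancy threshold; features whose weight falls below it are dropped
    rate      -- assumed exponential decay rate (not taken from the paper)
    """
    updated = {}
    for feature, weight in weights.items():
        distance = now - last_seen[feature]  # time distance since last occurrence
        if distance > alpha:
            # Assumed exponential decay; the paper's actual function may differ.
            weight *= math.exp(-rate * (distance - alpha))
        if weight >= beta:
            updated[feature] = weight  # keep features that are still informative
    return updated
```

Under these assumptions, a feature seen recently keeps its weight, a feature unseen for longer than α loses weight gradually, and a feature whose decayed weight drops below β is removed from the topic representation.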

