Computer Science, 2016, Vol. 43, Issue (1): 237-241. doi: 10.11896/j.issn.1002-137X.2016.01.051

• Artificial Intelligence •

News-summarization Extraction Method Based on Weighted Event Elements Strategy

GUO Yan-qing, ZHAO Rui, KONG Xiang-wei, FU Hai-yan and JIANG Jin-ping

  • Affiliations, in author order: GUO Yan-qing: School of Information and Communication Engineering, Dalian University of Technology, Dalian 116024; Postdoctoral Research Station, State Information Center, Beijing 100045. The other four authors: School of Information and Communication Engineering, Dalian University of Technology, Dalian 116024
  • Online: 2018-12-01 Published: 2018-12-01
  • Funding: Supported by the China Postdoctoral Science Foundation (20110490343, 2013T60090)

Abstract: To help readers quickly grasp how an event unfolded from a large volume of news reports, this paper analyzes how the event elements of a news story affect the generated summary and, drawing on the way news events evolve over time, proposes a news summary extraction method based on weighted event elements. Weighting the event elements modifies the transition probability matrix, so that summary information is extracted effectively in chronological order. The resulting timeline summary contains more detail about the news elements and is more readable. Experimental results demonstrate the effectiveness of the proposed algorithm.

Key words: News event, Timeline summarization, Transition probability matrix, Element weighting
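
The abstract states that event-element weights are used to modify a transition probability matrix, but it does not give the formula. The following is only a minimal sketch of one way such a weighting could work, not the authors' method: it assumes a PageRank-style random walk over sentences, bag-of-words cosine similarity, and a hypothetical weight of 1 + boost × (number of distinct event elements in the target sentence). All function names and parameters (`element_weight`, `boost`, `damping`) are illustrative assumptions.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two token lists, treated as bags of words."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0


def element_weight(tokens, elements, boost=1.0):
    """Hypothetical weight: grows with the number of distinct event elements present."""
    return 1.0 + boost * len(set(tokens) & elements)


def weighted_transition_matrix(sentences, elements, boost=1.0):
    """Row-stochastic matrix; moves into element-rich sentences are favoured."""
    n = len(sentences)
    w = [element_weight(s, elements, boost) for s in sentences]
    matrix = []
    for i in range(n):
        row = [0.0 if i == j else cosine(sentences[i], sentences[j]) * w[j]
               for j in range(n)]
        total = sum(row)
        # A sentence similar to nothing falls back to a uniform row.
        matrix.append([x / total for x in row] if total > 0 else [1.0 / n] * n)
    return matrix


def rank(matrix, damping=0.85, iters=100):
    """Power iteration with damping, as in PageRank."""
    n = len(matrix)
    r = [1.0 / n] * n
    for _ in range(iters):
        r = [(1 - damping) / n
             + damping * sum(r[i] * matrix[i][j] for i in range(n))
             for j in range(n)]
    return r


# Toy data (invented for illustration): three tokenized sentences from one day.
sentences = [
    "fire broke out in dalian on monday".split(),
    "fire broke out in the city".split(),
    "officials said fire was contained".split(),
]
elements = {"dalian", "monday"}  # assumed place and time elements

plain = rank(weighted_transition_matrix(sentences, elements, boost=0.0))
weighted = rank(weighted_transition_matrix(sentences, elements, boost=1.0))
```

In this toy example the first sentence, which carries the assumed place and time elements, scores higher under `weighted` than under `plain`. A timeline summary could then be assembled by applying such a ranking to the sentences of each date in chronological order, which this sketch does not show.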

