Computer Science ›› 2016, Vol. 43 ›› Issue (9): 77-81. doi: 10.11896/j.issn.1002-137X.2016.09.014

• The 3rd CCF Big Data Conference, 2015 •

Microblog Text Summarization Based on Entity Relation Network

XUE Zhu-jun, YANG Shu-qiang and SHU Yang-xue

  1. College of Computer, National University of Defense Technology, Changsha 410073
  • Online: 2018-12-01 Published: 2018-12-01
  • Supported by:
    This work was supported by the National High Technology Research and Development Program of China (863 Program) (2012AA01A401)

Abstract: Building on syntactic parsing of microblog text, and on the definition and formal representation of entity relations, this paper proposes a directed-graph relation network model that reflects the structural relationships between texts. The model captures the semantic information of the text and compensates for the shortcomings of word-frequency features. An improved TPR (Topic-PAGERANK) is then used to score each node, and the score represents the importance of the corresponding relation tuple. The semantic segments of the original microblog posts that correspond to the relation tuples are output in ranked order as the summary. Finally, experiments show that summaries extracted by the relation-network-based automatic summarization method cover more information with less redundancy.

Key words: Entity relationship, Short text, Text representation, Syntax parsing, Topic-PAGERANK
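
The ranking step described in the abstract can be illustrated with a minimal sketch. It is not the paper's improved TPR, whose details are not given here. It is a plain topic-biased PageRank over a toy directed graph of relation tuples. The tuple format, the edge rule (an edge from tuple i to tuple j when the object of i is the subject of j), the topic weights, and all function names are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical relation tuple format: (subject entity, relation, object entity).
RelationTuple = Tuple[str, str, str]


def build_edges(tuples: List[RelationTuple]) -> Dict[int, List[int]]:
    """Assumed edge rule: link tuple i to tuple j if i's object is j's subject."""
    edges: Dict[int, List[int]] = {i: [] for i in range(len(tuples))}
    for i, (_, _, obj) in enumerate(tuples):
        for j, (subj, _, _) in enumerate(tuples):
            if i != j and obj == subj:
                edges[i].append(j)
    return edges


def topic_biased_pagerank(
    edges: Dict[int, List[int]],
    topic_weight: Dict[int, float],
    damping: float = 0.85,
    iterations: int = 50,
) -> Dict[int, float]:
    """Power iteration for PageRank with a topic-weighted teleport vector."""
    nodes = sorted(topic_weight)
    total = sum(topic_weight.values())
    teleport = {n: topic_weight[n] / total for n in nodes}
    rank = dict(teleport)
    for _ in range(iterations):
        # Rank held by nodes without out-links is redistributed via teleport.
        dangling = sum(rank[n] for n in nodes if not edges.get(n))
        base = 1.0 - damping + damping * dangling
        new_rank = {n: base * teleport[n] for n in nodes}
        for src in nodes:
            targets = edges.get(src, [])
            for dst in targets:
                new_rank[dst] += damping * rank[src] / len(targets)
        rank = new_rank
    return rank


# Toy relation tuples (invented example data, not from the paper).
tuples: List[RelationTuple] = [
    ("typhoon", "hits", "coastal city"),
    ("coastal city", "closes", "schools"),
    ("government", "issues", "red alert"),
    ("red alert", "covers", "coastal city"),
]
# Hypothetical topic relevance of each tuple; larger means closer to the topic.
topic_weight = {0: 2.0, 1: 1.0, 2: 1.0, 3: 1.0}

edges = build_edges(tuples)
rank = topic_biased_pagerank(edges, topic_weight)

# Output tuples in descending order of score, as the summary order.
order = sorted(rank, key=rank.get, reverse=True)
for idx in order:
    print(f"{rank[idx]:.3f}", tuples[idx])
```

In this sketch the topic weights only bias the teleport vector. A real system would map each ranked tuple back to the span of the original post it was extracted from and output those spans as the summary.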

