计算机科学 ›› 2019, Vol. 46 ›› Issue (10): 116-121.doi: 10.11896/jsjkx.180901759

• 网络与通信 • 上一篇    下一篇

一种基于通联数据的信息扩散路径推测算法

项英倬, 魏强, 游凌   

  1. (盲信号处理国家重点实验室 成都610041)
  • 收稿日期:2018-09-17 修回日期:2018-12-12 出版日期:2019-10-15 发布日期:2019-10-21
  • 通讯作者: 魏强(1987-),男,工程师,主要研究方向为信息处理,E-mail:weiqianglg@163.com。
  • 作者简介:项英倬(1990-),男,博士生,主要研究方向为人工智能,E-mail:xiangyzh@foxmail.com;游凌(1971-),男,研究员,主要研究方向为网络态势感知。
  • 基金资助:
    本文受国家自然科学基金(61174124)资助。

Information Diffusion Path Inferring Algorithm Based on Communication Data

XIANG Ying-zhuo, WEI Qiang, YOU Ling   

  1. (National Key Laboratory of Science and Technology on Blind Signal Processing,Chengdu 610041,China)
  • Received:2018-09-17 Revised:2018-12-12 Online:2019-10-15 Published:2019-10-21

摘要: 信息的传播和扩散对于研究市场营销、病毒木马的传播等具有重要意义。但是,在许多场景下仅能获取网络中用户的通联数据,难以获取用户间通信的内容。针对该问题,文中提出了一个基于概率的信息传播模型来对网络中的通联数据进行建模,以此估计网络中用户通信内容的相关性,进而推测网络中信息的扩散路径。文中证明了求解该模型的复杂度为NP-hard,并提出了PathMine算法来获取模型的一个近似最优解。实验表明,所提PathMine算法能够高效地挖掘网络中信息的传播模式,优于已知的其他方法。

关键词: 网络分析, 信息扩散, 信息流, 子模函数

Abstract: Information diffusion and propagation play an important role in viral marketing and virus diffusion.How-ever,in many occasions only the connected data of users in the network can be obtained,which makes it difficult to obtain the content of communication between users.To deal with such challenges,this paper proposed an information diffusion model based on probability in order to predict the relativity of communications between users,and then infer the diffusion path of information in the network.In addition,this paper proved that the complexity of solving the model is NP-hard,and proposed PathMine algorithm to get a near optimal solution.Experiment results show that the proposed PathMine algorithm outperforms other state-of-art algorithms.

Key words: Information diffusion, Information flow, Network analysis, Submodule function

中图分类号: 

  • TP311
[1]LAZARSFELD P F,KATZ E,ROPER E.Personal influence:The part played by people in the flow of mass communications[M].New York:Routledge,2017.
[2]ROGERS E M.Diffusion of innovations[M].Simon and Schuster,2010.
[3]WATTS D J,DODDS P S.Influentials,networks,and public opinion formation[J].Journal of Consumer Research,2007,34(4):441-458.
[4]MCCALLUM A,WANG X,CORRADA-EMMANUEL A. Topic and role discovery in social networks with experiments on enron and academic email[J].Journal of Artificial Intelligence Research,2007,30:249-272.
[5]ZHANG Y,WU Y,YANG Q.Community Discovery in Twitter Based on User Interests[J].Journal of Computational Information Systems,2012,8(3):991-1000.
[6]FEI H,JIANG R,YANG Y,et al.Content based social behavior prediction:a multi-task learning approach[C]//Proceedings of the 20th ACM International Conference on Information and Knowledge Management.ACM,2011:995-1000.
[7]LIN C,MEI Q,JIANG Y,et al.Inferring the diffusion and evolution of topics in social communities[C]//11th IEEE International Conference on Data Mining(ICDM 2011).Vancouver:IEEE,2011.
[8]ZHU J,XIONG F,PIAO D,et al.Statistically modeling the effectiveness of disaster information in social media[C]//Global Humanitarian Technology Conference (GHTC).IEEE,2011:431-436.
[9]ALTENBURGER,KRISTEN M,JOHAN U.Monophily in social networks introduces similarity among friends-of-friends[J].Nature Human Behaviour,2018,2(4):284.
[10]YANG M,HSU W H,KALLUMADI S T.Predictive Analytics of Social Networks:A Survey of Tasks and Techniques[J].Journal of Applied Physics,2014,112(1):014301-014301-5.
[11]ALBERT-LÁSZLÓ B.The origin of bursts and heavy tails in human dynamics[J].Nature,2005,435(7039):207.
[12]MALMGRENR D,STOUFFER D B,MOTTER A E,et al.A Poissonian explanation for heavy tails in e-mail communication[J].Proceedings of the National Academy of Sciences,2008,105(47):18153-18158.
[13]LESKOVEC J,MCGLOHON M,FALOUTSOS C,et al.Cascading behavior in large blog graphs: Patterns and a model.arXiv:0704.2803.
[14]KHULLER S,MOSS A,NAOR J.The budgeted maximum co-verage problem[J].Information Processing Letters,1991,70(1):39-45.
[15]LESKOVEC J,KRAUSE A,GUESTRIN C,et al.Cost-effective outbreak detection in networks[C]//Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Disco-very and Data Mining.ACM,2007:420-429.
[16]GOYAL A,BONCHI F,LAKSHMANAN L V S.Learning influence probabilities in social networks[C]//Proceedings of the third ACM International Conference on Web Search and Data Mining.ACM,2010:241-250.
[17]LESKOVEC J,FALOUTSOS C.Scalable modeling of real graphs using kronecker multiplication[C]//Proceedings of the 24th International Conference on Machine Learning.ACM,2007:497-504.
[18]ERDS P,RÉNYI A.On the evolution of random graphs[J]. Transactions of the American Mathematical Society,2011,286(1):257-274.
[19]CLAUSET A,MOORE C,NEWMAN M E J.Hierarchical structure and the prediction of missing links in networks[J].Nature,2008,453(7191):98.
[20]GROVER A,ZWEIG A,ERMON S.Graphite:Iterative generative modeling of graphs[J].arXiv:1803.10459,2018.
[21]SHETTY J,ADIBI J.The Enron email dataset database schema and brief statistical report[R].Information Sciences Institute Technical Report,University of Southern California,2004:120-128.
[22]COHEN W W.Enron Email Data Set[OL].http://www.cs.cmu.edu/~enron/.
[1] 谭琪, 张凤荔, 张志扬, 陈学勤.
社交网络用户影响力的建模方法
Modeling Methods of Social Network User Influence
计算机科学, 2021, 48(2): 76-86. https://doi.org/10.11896/jsjkx.191200102
[2] 刘明聪, 王娜, 周宁.
基于依赖分析的云组合服务信息流控制机制
Dependency Analysis Based Cloud Composition Service Information Flow Control Mechanism
计算机科学, 2019, 46(4): 189-196. https://doi.org/10.11896/j.issn.1002-137X.2019.04.030
[3] 王雪健, 赵国磊, 常朝稳, 王瑞云.
信息流格模型的非法流分析
Illegal Flow Analysis for Lattice Model of Information Flow
计算机科学, 2019, 46(2): 139-144. https://doi.org/10.11896/j.issn.1002-137X.2019.02.022
[4] 闫佳琪, 陈俊华冷晶.
大规模网络总体通讯性及其效率评价分析
Total Communication and Efficiency Analysis of Large Scale Networks
计算机科学, 2018, 45(6A): 283-289.
[5] 朱浩,陈建平.
软件系统的可信降密述评
Review of Trust Declassification for Software System
计算机科学, 2018, 45(6A): 36-40.
[6] 范艳芳.
协作环境下的时空约束强制访问控制模型
Temporal-Spatial-based Mandatory Access Control Model in Collaborative Environment
计算机科学, 2017, 44(8): 107-114. https://doi.org/10.11896/j.issn.1002-137X.2017.08.020
[7] 杜远志,杜学绘,杨智.
基于混合流策略的按需分布式云信息流控制模型
Mixed Flow Policy Based On-demand Distributed Cloud Information Flow Control Model
计算机科学, 2017, 44(10): 150-158. https://doi.org/10.11896/j.issn.1002-137X.2017.10.029
[8] 杨迎辉,李建华,南明莉,崔琼,王宏.
基于模体的网络化作战信息流转动态超图模型
Networked Operational Information Flowing Dynamic Hypergraph Model Based on Motif
计算机科学, 2016, 43(8): 30-35. https://doi.org/10.11896/j.issn.1002-137X.2016.08.006
[9] 谢杨晓洁,赵凌.
大数据下基于信息流的快速种子用户识别
Precise Identification of Seed Users Based on Information Flow in Big Data
计算机科学, 2016, 43(7): 281-284. https://doi.org/10.11896/j.issn.1002-137X.2016.07.051
[10] 杨迎辉,李建华,南明莉,田言涛.
基于信息流的作战体系结构建模
Operational Architecture Modeling Based on Information Flow
计算机科学, 2016, 43(2): 13-18. https://doi.org/10.11896/j.issn.1002-137X.2016.02.003
[11] 金 丽,朱 浩.
基于自动机监控的二维降密策略
Declassification Policy Based on Automaton Monitoring
计算机科学, 2015, 42(7): 194-199. https://doi.org/10.11896/j.issn.1002-137X.2015.07.043
[12] 张钰,刘胜美.
基于多属性判决的网络选择算法
Network Selection Algorithm Based on Multi-attribute Decision
计算机科学, 2015, 42(6): 120-124. https://doi.org/10.11896/j.issn.1002-137X.2015.06.027
[13] 何鹏,李兵,杨习辉,熊伟.
开源软件社区开发者偏好合作行为研究
Research on Developer Preferential Collaboration in Open-source Software Community
计算机科学, 2015, 42(2): 161-166. https://doi.org/10.11896/j.issn.1002-137X.2015.02.035
[14] 邵婧,陈左宁,殷红武,许国春.
面向PaaS云的信息流控制框架设计与实现
Design and Implementation of Information Flow Control Framework for PaaS
计算机科学, 2015, 42(12): 257-262.
[15] 金丽,朱 浩.
多线程环境中的二维降密策略
Two-dimension Declassification Policy in Multithreaded Environments
计算机科学, 2015, 42(12): 243-246.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!