计算机科学 ›› 2018, Vol. 45 ›› Issue (2): 103-108.doi: 10.11896/j.issn.1002-137X.2018.02.018

• 2017年中国计算机学会人工智能会议 • 上一篇    下一篇

用户频繁通信关系的并行挖掘算法研究

朱鹏宇,鲍培明,吉根林   

  1. 南京师范大学计算机科学与技术学院 南京210023,南京师范大学计算机科学与技术学院 南京210023,南京师范大学计算机科学与技术学院 南京210023
  • 出版日期:2018-02-15 发布日期:2018-11-13
  • 基金资助:
    本文受国家自然科学基金项目:云计算环境下顾及用户关系的手机用户时空轨迹模式挖掘方法研究(41471371)资助

Parallel Algorithm for Mining User Frequent Communication Relationship

ZHU Peng-yu, BAO Pei-ming and JI Gen-lin   

  • Online:2018-02-15 Published:2018-11-13

摘要: 随着移动通信技术和互联网的飞速发展,移动通信设备已经成为大多数人随身携带的工具,这些设备之间因互相通信而产生的数据构成了通信网络。文中提出了一种针对海量通信数据的频繁通信子图并行挖掘算法PMFCS。该算法 在频繁项目集挖掘思想和子图连接规则的基础上, 利用并行计算框架Spark 将所有的图以边为单位分布到各个计算节点,在各个节点统计1阶候选频繁子图,再通过汇总候选子图得到1阶频繁子图。PMFCS算法通过迭代地连接k-1阶子图和1阶子图生成k阶候选子图,再计算k阶候选子图的频繁度,直至k阶频繁子图集合为空集。实验结果表明,该算法可以快速、有效地解决频繁通信关系的挖掘问题。

关键词: 通信网络,频繁子图,频繁通信关系

Abstract: With the rapid development of mobile communication technology and Internet,mobile communication equipment has become a portable tool for most people.A parallel algorithm PMFCS was proposed for mining frequent communication sub-graph of mass communication data.The algorithm is based on the Apriori algorithm and sub-graph connect principle.It uses Spark to distribute all the edges to each computing node,then the 1th-order frequent candidate sub-graphs are distributed to each node,the 1th-order frequent candidate sub-graphs are counted at each node,and the 1th-order sub-graphs are got by summarizing candidate sub-graphs.PMFCS iteratively connects the (k-1)th-order sub-graph and the 1th-order sub-graph to generate kth-order candidate sub-graphs.Subsequently,the algorithm terminates until the kth-order frequent sub-graph set is empty.The experimental results show that PMFCS can mine the frequent communication sub-graph efficiently and quickly.

Key words: Communication network,Frequent sub-graph,Frequent communication relationship

[1] ZAHARIA M,CHOWDHURY M,DAS T,et al.Resilient distributed datasets:A fault-tolerant abstraction for in-memory cluster computing[C]∥Proceedings of the 9th USENIX Confe-rence on Networked Systems Design and Implementation.USENIX Association,2012:2.
[2] DEAN J,GHEMAWAT S.MapReduce:simplified data proces-sing on large clusters[J].Communications of the ACM,2008,51(1):107-113.
[3] NATU M,SADAPHAL V,PATIL S,et al.Mining frequentsubgraphs to extract communication patterns in data-centres[C]∥International Conference on Distributed Computing and Networking.Springer Berlin Heidelberg,2011:239-250.
[4] AGRAWAL R,SRIKANT R.Fast algorithms for mining associa-tion rules[C]∥Proceedings of the 20th VLDB Conference.1994:487-499.
[5] NIRMALA P,SULOCHANA L R,RETHNASAMY N.Centrality measures-based algorithm to visualize a maximal common induced subgraph in large communication networks[J].Know-ledge and Information Systems,2016,6(1):213-239.
[6] PARISUTHMA N,RETHNASAMY N.An algorithm to search edge relaxed query graph with minimum support threshold in large communication networks[C]∥International Conference on Communication Systems and Networks.IEEE,2016:1-6.
[7] INOKUCHI A,WASHIO T,MOTODA H.An apriori-based algorithm for mining frequent substructures from graph data[C]∥European Conference on Principles of Data Mining and Know-ledge Discovery.Springer Berlin Heidelberg,2000:13-23.
[8] KURAMOCHI M,KARYPIS G.Frequent subgraph discovery[C]∥Proceedings of IEEE International Conference on Data Mining.IEEE,2001:313-320.
[9] YAN X,HAN J.gspan:Graph-based substructure pattern mi-ning[C]∥International Conference on Data Mining.IEEE,2002:721-724.
[10] HUAN J,WANG W,PRINS J.Efficient mining of frequent subgraphs in the presence of isomorphism[C]∥International Conference on Data Mining.IEEE,2003:549-552.
[11] WANG W,ZHOU H F,YUAN Q Q,et al.Mining frequent patterns based on graph theory[J].Journal of Computer Research and Development,2005,2(2):230-235.(in Chinese) 汪卫,周皓峰,袁晴晴,等.基于图论的频繁模式挖掘[J].计算机研究与发展,2005,2(2):230-235.
[12] LI X T,LI J Z,GAO H.An efficient frequent subgraphmining algorithm[J].Journal of Software,2007,8(10):2469-2480.(in Chinese) 李先通,李建中,高宏.一种高效频繁子图挖掘算法[J].软件学报,2007,8(10):2469-2480.
[13] BHUIYAN M A,AL HASAN M.FSM-H:Frequent subgraph mining algorithm in Hadoop[C]∥International Congress on Big Data.IEEE,2014:9-16.
[14] LU W,CHEN G,TUNG A K H,et al.Efficiently extracting frequent subgraphs using mapreduce[C]∥International Confe-rence on Big Data.IEEE,2013:639-647.
[15] Reality commons.http://realitycommons.media.mit.edu/realitymining.html.

No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!