计算机科学 ›› 2014, Vol. 41 ›› Issue (6): 155-160.doi: 10.11896/j.issn.1002-137X.2014.06.030

• 软件与数据库技术 • 上一篇    下一篇

自适应云端的大规模导出子图提取算法

郭鑫,董坚峰,周清平   

  1. 吉首大学软件服务外包学院 张家界427000;武汉大学信息资源研究中心 武汉430072;吉首大学软件服务外包学院 张家界427000
  • 出版日期:2018-11-14 发布日期:2018-11-14
  • 基金资助:
    本文受湖南省工业支撑计划重点项目(2012GK2006),湖南省教育厅科学研究项目(12C0291,11C1051),生态旅游湖南省重点实验室开放基金项目(JDSTLY201206),湖南省图书馆学会2013-2014年度重点课题(XHZD1007)资助

Large Scale Induced Subgraphs Mining Algorithm on Self Adaptive Cloud

GUO Xin,DONG Jian-feng and ZHOU Qing-ping   

  • Online:2018-11-14 Published:2018-11-14

摘要: 针对现有云计算平台资源随机调配与传统导出子图挖掘效率较低等问题,进一步提升云计算平台中资源整合利用效率与大规模导出子图挖掘效率,提出了一种自适应云端的大规模导出子图提取算法,以解决资源优化利用与海量图挖掘等问题。首先介绍了云计算概念与导出子图挖掘相关概念以及问题描述;接着根据MapReduce并行处理模型设计了一种自适应任务动态分配算法SAC_TA(Self Adaptive Cloud Dynamic Allocation),它根据计算任务自适用分配系统资源以达到成本消耗的最优;并设计出自适应云端框架,然后基于自适应云端提出了大规模导出子图挖掘算法SFGFF(SAC_TA、Find_VE、G_F1、FindPartFG、FindAllFG),它共分为4个阶段的挖掘,将所有算法应用到自适应云端中可构成整个导出子图挖掘体系;最后在人工模拟数据与真实环境数据下进行了试验, 结果表明,自适应云端运行良好,算法有效可行,具有较高的加速比与运行效率,能有效满足大规模频繁导出子图挖掘的需求。

关键词: 大数据,数据挖掘,云计算,导出子图,子图同构 中图法分类号TP311文献标识码A

Abstract: Aiming at the current puzzles of random resource allocation of cloud computing platform and lower mining efficiency of traditional induced subgraph,promoting the efficiency of resource integration and using of cloud computing platform and large-scale induced subgraph mining,the paper put forward an algorithm of large-scale induced subgraph extraction for self-adaption cloud to solve the problems of resource optimal utilization and massive graph mining.The paper firstly introduced the relevant concepts and problem description of cloud computing and induced subgraph mining,then designed an algorithm SAC_TA of self-adaption task dynamic allocation according to MapReduce parallel processing model,which can comput task self-adaption allocation system resources to reach the optimum of cost wasting,meanwhile designed the self-adaption cloud framework.On the basis of the framework,the paper put forward the massive induced subgraph mining algorithm SFGFF,which includs four stages of mining.And while applying all the algorithms to self-adaption cloud, the whole induced subgraph mining system can be constructed. The experimental result of manual simulation data and real environment data shows that the self-adaption cloud runs well and the algorithms are efficient and feasible,and have higher speed-up ratio and operating efficiency to satisfy the demand of massive frequent induced subgraph mining.

Key words: Big data,Data mining,Cloud computing,Induced subgraph,Subgraph isomorphism

[1] 覃雄派,王会举,杜小勇,等.大数据分析—RDBMS 与MapReduce 的竞争与共生[J].软件学报,2012,3(1):32-45
[2] TDWI Checklist Report:Big Data Analytics.http://tdwi.org/research/ 2010/08Big-Data-Analytics.aspx
[3] 邹兆年,李建中,高宏,等.从不确定图中挖掘频繁子图模式[J].软件学报,2009,0(11):2965-2976
[4] Zou Xiao-hong,Chen Xiao,Guo Jing-feng,et al.An improved algorithm for mining Close Graph[J].ICIC Express Letters Journal of Research and Surveys,2010,4(4):1135-1140
[5] 薛冰,张俊峰,郑超.基于分割图集的频繁闭图挖掘算法[J].计算机应用研究,2011,8(1):61-64
[6] Guo Jing-feng,Chai Ran,Li Jia.Top-down algorithm for mining maximal frequent subgraph[J].Advanced Research on Industry,Information System and Materials Engineering,2011,204-210:1472-1476
[7] 刘勇,李建中,高宏.从图数据库中挖掘频繁跳跃模式[J].软件学报,2010,1(10):2477-2493
[8] 刘文艳.基于深度优先策略的频繁导出子图挖掘算法[D].西安:西安电子科技大学,2009
[9] Gupta S,Raman V,Saurabh S.Maximum r-Regular InducedSubgraph Problem:Fast Exponential Algorithms and Combinatorial Bounds[J].SIAM Journal on Discrete Mathematics,2012,26(4):1758-1780
[10] Lenk A,Klems M,Nimis J,et al.What’s inside the cloud? An Architectural Map of the Cloud Landscape[C]∥ Proceedings of the 2009ICSE Workshop on Software Engineering Challenges of Cloud Computing.2009:23-31
[11] Son J,Choi H,Chung Y D.Skew-tolerant key distribution forload balancing in MapReduce[J].IEICE Transaction on Information and Systems,2012,95(2):677-680
[12] Valiant Leslie G.A bridging model for parallel computation[J].Communication of the ACM,1990,33(3):103-111
[13] Grzegorz M,Matthew A H,Bik Art J C,et al.Pregel:A system for large-scale graph processing[C]∥ Proceedings of the SIGMOD.Indianapolis,Indiana,USA,2010:135-145
[14] Avery C.Giraph:Large-scale graph processing infrastruction on Hadoop[C]∥ Proceedings of the Hadoop Summit.Santa Clara,2011
[15] Tyson C,Neil C,Peter A,et al.MapReduce Online[C]∥ Proceedings of the NSDI.San,Jose,California,USA,2010:33-48
[16] Islam S,Gregoire J C.Giving user an edge:A flexible cloud modeland its application for multimedia[J].Future Generation Computer Systems,2012,28(6):823-832
[17] Samba A.Logical data models for cloud computing architectures[J].IT Professional,2012,14(1):19-26
[18] Huang H,Wang L Q.P&P:A combined Push-Pull model for resource monitoring in cloud computing environment[C]∥ Proceedings of the 2010IEEE 3rd International Conference on Cloud Computing.Miami,Florida,USA,2010:260-267
[19] Tsai Wei-Tek,Sun X,Balasooriya J.Service-oriented cloud computing architectued[C]∥ Proceedings of the 2010Seventh International Conference on Information Technology:New Genera-tions.Las Vegas,NV,USA,2010:684-689
[20] 王桂娟,印鉴,詹卫许.GC-BES:一种新的基于嵌入集的图分类方法[J].计算机科学,2012,9(6):155-158

No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!