Computer Science, 2014, Vol. 41, Issue (8): 75-80. doi: 10.11896/j.issn.1002-137X.2014.08.016

• 2013 National Conference on Theoretical Computer Science •

Research on a Parallel Scheduling Strategy Based on Fuzzy Clustering in a Cloud Computing Environment

张千,梁鸿,郉永山   

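  1. College of Computer and Communication Engineering, China University of Petroleum, Qingdao 266555
  • Publication date: 2018-11-14  Release date: 2018-11-14
  • Funding:
    This work was supported by the China National Petroleum Corporation Petroleum Science and Technology Innovation Fund for Young and Middle-aged Researchers (07E1024) and the Fundamental Research Funds for the Central Universities (13CX02032A).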

Cloud Parallel Task Scheduling Algorithm Based on Fuzzy Clustering

ZHANG Qian,LIANG Hong and XING Yong-shan   

  • Online:2018-11-14 Published:2018-11-14

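Abstract: Parallel task scheduling is one of the core problems in distributed computing research. Motivated by the demand for high-performance computing in processing large-scale petroleum seismic exploration data, this paper studies the parallel scheduling of seismic data in a cloud computing environment. Because seismic data sets are large, a large job is usually split into parts that are processed in parallel to achieve higher processing efficiency. A key problem in parallel processing is how to assign the partitioned tasks to suitable scheduling nodes. In the most efficient case, every resource node in the cloud environment is busy computing: high-performance nodes execute tasks with large, complex job blocks, while lower-performance nodes run small tasks or tasks with modest computing requirements, so that the load is balanced overall. Based on the idea of fuzzy clustering, this paper therefore proposes a scheduling optimization strategy that clusters tasks and resources together. Using the degree of match between job attributes and resource node attributes as the criterion, it partitions the parallel jobs by clustering, which reduces the scale of the scheduling problem and lays the foundation for dynamic task scheduling. After partitioning, a scheduling algorithm based on improved Bayes classification is introduced to quickly match resource nodes with the jobs in the queue according to the nodes' real-time load. Experiments confirm that the scheme achieves high execution efficiency.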

Keywords: cloud computing, parallel scheduling, fuzzy clustering, hybrid task and resource clustering, Bayes classification algorithm
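
The hybrid clustering idea in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: it assumes, purely for illustration, that resource nodes and tasks are both described by the same two normalized attributes (compute and memory, as capacity for nodes and as demand for tasks), and it applies the standard fuzzy c-means update rules to the combined set. The names `fuzzy_c_means`, `hybrid_groups`, `NODES`, and `TASKS`, and all attribute values, are hypothetical.

```python
import math
import random

def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means. Returns (centers, U), where U[i][k] is the
    membership of point i in cluster k and each row of U sums to 1."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalized to sum to 1.
    U = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        U.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Centers are means of the points weighted by membership ** m.
        centers = []
        for k in range(c):
            w = [U[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append([sum(w[i] * points[i][d] for i in range(n)) / total
                            for d in range(dim)])
        # Memberships follow from the relative distances to the centers.
        shift = 0.0
        for i in range(n):
            dists = [max(math.dist(points[i], centers[k]), 1e-12) for k in range(c)]
            new_row = [1.0 / sum((dists[k] / dists[j]) ** (2.0 / (m - 1.0))
                                 for j in range(c))
                       for k in range(c)]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, U[i])))
            U[i] = new_row
        if shift < tol:
            break
    return centers, U

def hybrid_groups(nodes, tasks, c=2):
    """Cluster nodes and tasks together, then group them by strongest membership.
    Assumes node names and task names do not collide."""
    names = list(nodes) + list(tasks)
    points = [nodes[n] for n in nodes] + [tasks[t] for t in tasks]
    _, U = fuzzy_c_means(points, c)
    groups = {k: {"nodes": [], "tasks": []} for k in range(c)}
    for name, row in zip(names, U):
        k = max(range(c), key=lambda j: row[j])
        groups[k]["nodes" if name in nodes else "tasks"].append(name)
    return groups

# Hypothetical normalized attributes: (compute, memory).
NODES = {"node-a": (0.9, 0.85), "node-b": (0.8, 0.9),
         "node-c": (0.2, 0.15), "node-d": (0.1, 0.2)}
TASKS = {"task-1": (0.85, 0.8), "task-2": (0.15, 0.1),
         "task-3": (0.9, 0.95), "task-4": (0.2, 0.25)}
```

Calling `hybrid_groups(NODES, TASKS)` places demanding tasks in the same group as high-capacity nodes, so a scheduler only has to match tasks against the nodes of their own group, which is one way to read the abstract's claim that clustering reduces the scale of the scheduling problem.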

Abstract: Parallel task scheduling is one of the key problems in cloud computing research. Motivated by the high-performance computing required to process massive oil seismic exploration data, this paper studies parallel scheduling in a cloud computing environment. Because seismic data are naturally separable, computing resources are used most fully when each job block is placed on a resource node whose capacity just meets the task's computing requirements. This paper proposes a scheduling optimization strategy, based on fuzzy clustering, that clusters tasks and resources together. The strategy uses the matching degree between tasks and resource nodes as its criterion and partitions concurrent jobs by clustering, which narrows the task scheduling scale and lays the foundation for dynamic task scheduling. After partitioning, an improved Bayes classification algorithm is introduced to quickly match tasks with compute nodes according to real-time load and the job queue. Experiments verify that the scheme achieves high execution efficiency.

Key words: Cloud computing, Parallel scheduling, Fuzzy clustering, Task and resource hybrid clustering, Bayes classification algorithm
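
The Bayes matching step can be sketched in a similar way. The abstract does not describe the paper's improvements, so the following is only a plain categorical naive Bayes classifier with Laplace smoothing, under the assumption that each past scheduling decision is recorded as a (node load, task size) pair labeled "accept" or "reject". The class `NaiveBayes`, the feature names, and the training rows are all hypothetical.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Plain categorical naive Bayes with Laplace smoothing (alpha)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.class_counts = Counter()
        self.feature_counts = defaultdict(Counter)  # (label, feature index) -> value counts
        self.feature_values = defaultdict(set)      # feature index -> values seen in training

    def fit(self, rows, labels):
        for row, label in zip(rows, labels):
            self.class_counts[label] += 1
            for j, value in enumerate(row):
                self.feature_counts[(label, j)][value] += 1
                self.feature_values[j].add(value)
        return self

    def log_posteriors(self, row):
        """Unnormalized log posterior of each label for one feature row."""
        total = sum(self.class_counts.values())
        scores = {}
        for label, count in self.class_counts.items():
            score = math.log(count / total)
            for j, value in enumerate(row):
                seen = self.feature_counts[(label, j)][value]
                k = len(self.feature_values[j])
                score += math.log((seen + self.alpha) / (count + self.alpha * k))
            scores[label] = score
        return scores

    def predict(self, row):
        scores = self.log_posteriors(row)
        return max(scores, key=scores.get)

# Hypothetical history: (node load, task size) -> was the assignment a good one?
ROWS = [("low", "large"), ("low", "small"), ("low", "large"), ("high", "large"),
        ("high", "large"), ("high", "small"), ("low", "small"), ("high", "large")]
LABELS = ["accept", "accept", "accept", "reject",
          "reject", "accept", "accept", "reject"]

model = NaiveBayes().fit(ROWS, LABELS)
```

With this toy history, `model.predict(("low", "large"))` favors placing a large task on a lightly loaded node, while `model.predict(("high", "large"))` favors rejecting the placement. A real scheduler would refresh the load feature from live monitoring before each decision.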

