Computer Science ›› 2020, Vol. 47 ›› Issue (6A): 318-324. doi: 10.11896/jsjkx.191100012

• Computer Network •

Load Balancing Strategy of Distributed Messaging System for Cloud Services

GAO Zi-yan, WANG Yong

  1. College of Computer Science and Technology, Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
  • Published: 2020-07-07
  • Corresponding author: WANG Yong (wangy@bjut.edu.cn)
  • Author e-mail: 641909633@qq.com
  • About author: GAO Zi-yan, bachelor. Her main research interests include distributed computing and big data.
    WANG Yong, born in 1974, Ph.D, associate professor. His main research interests include parallel and distributed computing.

Abstract: To address load skew between nodes in distributed messaging systems for cloud services, a dynamic load balancing strategy based on replica roles is proposed, and the algorithm is applied to Apache Kafka, a distributed streaming platform. Because the main function of a messaging system is to read, write and store messages, the algorithm takes CPU usage, disk usage, and network inbound and outbound traffic (bytes in/out) as the main load factors of a node, and proposes a corresponding Leadership Movement strategy and Replica Movement strategy for each load type. The feasibility of the algorithm is demonstrated in terms of time cost, space cost and service availability, and the influence of the algorithm's parameters on its performance is discussed. Experimental results show that the algorithm keeps the resource usage of every node in the cluster at or below the specified threshold. Compared with the default system, the standard deviation of cluster CPU usage decreases by 72.1%, that of disk usage by 86.1%, that of the bytes-in rate by 79.2%, and that of the bytes-out rate by 63.9%, a significant improvement.

Key words: Apache Kafka, Cloud service, Distributed messaging system, Load balancing, Multi-replica mechanism

CLC Number: 

  • TP393.4
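
The abstract evaluates balance by two criteria: no node's resource usage exceeds a specified threshold, and the standard deviation of per-node usage drops relative to the default system. The following is a minimal sketch of how such a check could be computed for one resource, not the authors' implementation. The function names, the broker names, the 0.8 threshold, and all utilisation figures are hypothetical and are not taken from the paper. The use of the population standard deviation is also an assumption.

```python
from statistics import pstdev


def overloaded(usage, threshold):
    """Return the (sorted) names of nodes whose usage exceeds the threshold."""
    return sorted(node for node, value in usage.items() if value > threshold)


def imbalance(usage):
    """Population standard deviation of per-node usage (assumed metric)."""
    return pstdev(list(usage.values()))


def reduction_percent(before, after):
    """Relative decrease of a metric, in percent."""
    return (before - after) / before * 100.0


# Hypothetical CPU utilisation per broker, as fractions of capacity.
# These numbers are illustrative only and do not come from the paper.
cpu_before = {"broker-1": 0.90, "broker-2": 0.30, "broker-3": 0.30}
cpu_after = {"broker-1": 0.55, "broker-2": 0.50, "broker-3": 0.45}
THRESHOLD = 0.8  # hypothetical per-node limit

if __name__ == "__main__":
    print("over threshold before:", overloaded(cpu_before, THRESHOLD))
    print("over threshold after: ", overloaded(cpu_after, THRESHOLD))
    drop = reduction_percent(imbalance(cpu_before), imbalance(cpu_after))
    print(f"standard deviation reduced by {drop:.1f}%")
```

The same two functions could be applied separately to disk usage, bytes-in rate and bytes-out rate, which is how the four percentages in the abstract are reported.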