Computer Science, 2026, Vol. 53, Issue (2): 387-395. doi: 10.11896/jsjkx.241200020

• Computer Network •

Multi-objective Optimization for Virtual Machine Placement in Large-scale Hadoop Cluster

WEN Jia1,2,3, WU Shuxia1,2,3, YU Zhengxin4, MIAO Wang5, CHEN Zheyi1,2,3   

  1 College of Computer and Data Science, Fuzhou University, Fuzhou 350116, China
  2 Key Laboratory of Spatial Data Mining & Information Sharing, Ministry of Education, Fuzhou 350002, China
  3 Fujian Provincial Key Laboratory of Networking Computing and Intelligent Information Processing, Fuzhou 350116, China
  4 School of Computing and Communications, Lancaster University, Lancaster LA1 4YW, UK
  5 Department of Computer Science, University of Exeter, Exeter EX4 4QF, UK
  • Received: 2024-12-02; Revised: 2025-03-18; Published: 2026-02-10
  • About author: WEN Jia, born in 2000, postgraduate, is a member of CCF (No. P7488G). Her main research interests include cloud/edge computing and virtual machine placement.
    CHEN Zheyi, born in 1991, Ph.D., professor, Ph.D. supervisor, is a member of CCF (No. 41902M). His main research interests include cloud/edge computing, resource optimization and machine learning.
  • Supported by:
    National Natural Science Foundation of China (62202103), Natural Science Foundation of Fujian Province for Distinguished Young Fund (2025J010020), Central Funds Guiding the Local Science and Technology Development (2022L3004), Fujian Province Technology and Economy Integration Service Platform (2023XRH001) and Fuzhou-Xiamen-Quanzhou National Independent Innovation Demonstration Zone Collaborative Innovation Platform (2022FX5).

Abstract: Virtualization technology underpins the rapid development of cloud computing. Hadoop is a popular distributed framework in cloud environments, but the performance of a Hadoop cluster is often limited by inefficient resource management. As data volumes and cluster scale grow, it becomes challenging to optimize Virtual Machine (VM) placement in a Hadoop cluster so as to reduce energy consumption, increase resource utilization, and lower file access latency. To address this challenge, this paper proposes a Multi-objective Optimization with Variable Length Double chromosome (MO-VLD) method for VM placement in large-scale Hadoop clusters. First, a double chromosome structure is designed by combining a variable length chromosome with NSGA-III. Next, two-stage crossover and mutation operations are introduced to increase the diversity of exploration in the solution space. Extensive simulation experiments on real-world runtime datasets from the Google cluster demonstrate that the proposed MO-VLD method can effectively handle dynamic resource demands and improve the resource management efficiency of the Hadoop cluster. Compared with benchmark methods, MO-VLD achieves better energy consumption, resource utilization, and file access latency.
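
The abstract does not give the encoding or the operators, so the following Python sketch is only one plausible reading of a variable-length double chromosome, not the authors' implementation. The names `Individual`, `vm_to_host`, `replica_to_vm`, `crossover`, and `pareto_front`, the meaning assigned to each chromosome, and the two crossover stages are all assumptions made for illustration. The NSGA-III reference-point selection is omitted; only the Pareto-dominance test it builds on is shown.

```python
import random
from dataclasses import dataclass


@dataclass
class Individual:
    # Assumed chromosome 1: gene i is the host index of VM i (length may vary).
    vm_to_host: list
    # Assumed chromosome 2: gene j is the index of the VM storing file replica j.
    replica_to_vm: list


def dominates(a, b):
    """True if objective vector a Pareto-dominates b (every objective minimized).

    A maximized objective such as resource utilization would be negated first.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Keep only the objective vectors that no other vector dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def crossover(p1, p2, rng):
    """Hypothetical two-stage crossover for the double chromosome."""
    # Stage 1: one-point crossover with independent cut points, so the child
    # can have a different number of VMs than either parent (at least one).
    cut1 = rng.randint(1, len(p1.vm_to_host))
    cut2 = rng.randint(0, len(p2.vm_to_host))
    vms = p1.vm_to_host[:cut1] + p2.vm_to_host[cut2:]
    # Stage 2: uniform crossover on the replica chromosome, then repair each
    # gene with a modulo so it refers to a VM that exists in the child.
    replicas = [rng.choice(pair) % len(vms)
                for pair in zip(p1.replica_to_vm, p2.replica_to_vm)]
    return Individual(vms, replicas)


if __name__ == "__main__":
    rng = random.Random(0)
    a = Individual([0, 1, 2, 0], [0, 3, 1])
    b = Individual([2, 2], [1, 0, 1])
    print(crossover(a, b, rng))
```

The modulo repair in stage 2 is the simplest way to keep the second chromosome consistent after the first one changes length; a real implementation would more likely reassign invalid replicas according to placement constraints.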

Key words: Cloud computing, Hadoop, Virtual machine placement, Multi-objective optimization, Genetic algorithm

CLC Number: TP393