Computer Science ›› 2025, Vol. 52 ›› Issue (3): 268-276.doi: 10.11896/jsjkx.240100126

• Artificial Intelligence •

Automatic Scheduling Search Optimization Method Based on TVM

HAN Lin1, WANG Yifan2, LI Jianan1, GAO Wei1   

  1 National Supercomputing Center in Zhengzhou,Zhengzhou University,Zhengzhou 450001,China
    2 School of Computer and Artificial Intelligence,Zhengzhou University,Zhengzhou 450001,China
  • Received:2024-01-12 Revised:2024-06-23 Online:2025-03-15 Published:2025-03-07
  • About author:HAN Lin,born in 1978,Ph.D,associate professor,is a senior member of CCF(No.16416M).His main research interests include compiler optimization and high-performance computing.
    GAO Wei,born in 1988,Ph.D.His main research interests include AI compiler optimization and advanced compilation technology.
  • Supported by:
    Major Science and Technology Special Project of Henan Province(221100210600).

Abstract: With the rapid development of artificial intelligence and the continuous emergence of new operators and hardware, the development and maintenance of operator libraries face enormous challenges, and manual optimization alone can no longer meet the performance requirements of AI models. Ansor is an automatic operator scheduling technique based on TVM that searches for the best scheduling scheme for a deep learning model or operator on a given backend and generates high-performance code without requiring users to manually define templates. However, its huge search space leads to low search efficiency. Therefore, two optimization schemes are proposed: one selects the best-performing sketch with a reinforcement learning algorithm, and the other predicts mutation rules with a machine learning model. Both schemes aim to reduce the time needed to find the optimal scheduling scheme and to generate high-performance operators quickly. To evaluate their effectiveness, three models including ResNet-50 and three operators including conv2d are tested. The results show that the optimized Ansor can generate target programs with the same or even better performance in only 70% to 75% of the original search time, and that under the optimal number of iterations the inference speed of the target program can be improved by up to 5%.
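For context, the search that the paper accelerates is the one exposed through TVM's auto_scheduler (Ansor) Python API. The sketch below shows how a conv2d search task is typically defined and tuned with the stock API; the workload shape, trial budget, and file name are illustrative assumptions rather than values from the paper, and the proposed sketch selection and mutation-rule prediction would act inside the search that task.tune() runs.

    # Minimal Ansor-style tuning of a conv2d operator with TVM's stock auto_scheduler API.
    import tvm
    from tvm import te, topi, auto_scheduler

    @auto_scheduler.register_workload
    def conv2d_workload(N, H, W, CO, CI, KH, KW, stride, padding):
        # Declare the conv2d computation in tensor-expression form.
        data = te.placeholder((N, CI, H, W), name="data")
        kernel = te.placeholder((CO, CI, KH, KW), name="kernel")
        conv = topi.nn.conv2d_nchw(data, kernel, stride, padding, dilation=1,
                                   out_dtype="float32")
        return [data, kernel, conv]

    target = tvm.target.Target("llvm")  # CPU backend; "cuda" would target a GPU
    task = auto_scheduler.SearchTask(
        func=conv2d_workload,
        args=(1, 224, 224, 64, 3, 7, 7, 2, 3),  # example shape (first ResNet-50 conv layer)
        target=target,
    )

    log_file = "conv2d_tuning.json"              # hypothetical log file name
    tune_option = auto_scheduler.TuningOptions(
        num_measure_trials=1000,                 # number of candidate programs measured
        measure_callbacks=[auto_scheduler.RecordToFile(log_file)],
        verbose=2,
    )
    task.tune(tune_option)                 # sketch generation + evolutionary search + measurement
    sch, args = task.apply_best(log_file)  # recover the best schedule found
    func = tvm.build(sch, args, target)    # compile the tuned operator

The reinforcement-learning-based sketch selection is only summarized in the abstract; as a purely illustrative sketch of the general idea (not the authors' algorithm), an epsilon-greedy selector that routes measurement trials toward the sketch with the best observed reward could look like the following, where the reward would be the measured throughput of programs sampled from a sketch.

    import random

    class EpsilonGreedySketchSelector:
        """Illustrative bandit-style selector over candidate sketches (assumed design)."""

        def __init__(self, num_sketches, epsilon=0.1):
            self.epsilon = epsilon
            self.counts = [0] * num_sketches
            self.values = [0.0] * num_sketches   # running mean reward per sketch

        def select(self):
            # Explore a random sketch with probability epsilon, otherwise exploit the best one.
            if random.random() < self.epsilon:
                return random.randrange(len(self.values))
            return max(range(len(self.values)), key=lambda i: self.values[i])

        def update(self, sketch_idx, reward):
            # Incremental mean update of the chosen sketch's estimated value.
            self.counts[sketch_idx] += 1
            self.values[sketch_idx] += (reward - self.values[sketch_idx]) / self.counts[sketch_idx]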

Key words: Auto schedule, TVM compiler, Optimizing search speed, Machine learning, Reinforcement learning, Deep learning model

CLC Number: TP311
[1]CHETLUR S,WOOLLEY C,VANDERMERSCH P,et al.cuDNN:Efficient Primitives for Deep Learning[J].arXiv:1410.0759,2014.
[2]KHAN J,FULTZ P,TAMAZOV A,et al.MIOpen:An OpenSource Library for Deep Learning Primitives[J].arXiv:1910.00078,2020.
[3]LI M Z,LIU Y,LIU X Y,et al.The Deep Learning Compiler:A Comprehensive Survey[J].IEEE Transactions on Parallel and Distributed Systems,2020,32(3):708-727.
[4]XING Y,WENG J,WANG Y S,et al.An In-depth Comparison of Compilers for Deep Neural Networks on Hardware[C]//2019 IEEE International Conference on Embedded Software and Systems(ICESS).IEEE,2019:1-8.
[5]CHEN T Q,MOREAU T,JIANG Z H,et al.TVM:End-to-End Optimization Stack for Deep Learning[C]//Proceedings of the 13th USENIX Conference on Operating Systems Design and Implementation(OSDI’18).Carlsbad,USA:USENIX Association,2018:579-594.
[6]ZHAO J,LI B J,WANG N,et al.AKG:Automatic Kernel Generation for Neural Processing Units Using Polyhedral Transformations[C]//Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation(PLDI 2021).New York,USA:Association for Computing Machinery,2021:1233-1248.
[7]LATTNER C,AMINI M,BONDHUGULA U,et al.MLIR:Scaling Compiler Infrastructure for Domain Specific Computation[C]//2021 IEEE/ACM International Symposium on Code Generation and Optimization(CGO).IEEE,2021:2-14.
[8]ABADI M,BARHAM P,CHEN J M,et al.Tensorflow:Large-scale Machine Learning on Heterogeneous Distributed Systems[C]//Proceedings of the 12th USENIX conference on Operating Systems Design and Implementation(OSDI’16).USA:USENIX Association,2016:265-283.
[9]PASZKE A,GROSS S,MASSA F,et al.PyTorch:An Imperative Style,High-Performance Deep Learning Library[C]//Proceedings of the 33rd International Conference on Neural Information Processing Systems.Red Hook,NY,USA:Curran Associates Inc.,2019:8026-8037.
[10]GASKILL B.ONNX:the Open Neural Network Exchange Format[J].Linux Journal,2018(285):157-161.
[11]ROESCH J,LYUBOMIRSKY S,WEBER L,et al.Relay:A New IR for Machine Learning Frameworks[C]//Proceedings of the 2nd ACM SIGPLAN International Workshop on Machine Learning and Programming Languages(MAPL 2018).New York,NY,USA:Association for Computing Machinery,2018:58-68.
[12]RAGAN-KELLEY J,BARNES C,ADAMS A,et al.Halide:A Language and Compiler For Optimizing Parallelism,Locality,And Recomputation in Image Processing Pipelines[J].ACM Sigplan Notices,2013,48(6):519-530.
[13]CHEN T Q,ZHENG L M,YAN E,et al.Learning to Optimize Tensor Programs[C]//Proceedings of the 32nd International Conference on Neural Information Processing Systems(NIPS’18).USA:Curran Associates Inc.,2018:3393-3404.
[14]ZHENG L M,JIA C F,SUN M M,et al.Ansor:Generating High-Performance Tensor Programs for Deep Learning[C]//Proceedings of the 14th USENIX Conference on Operating Systems Design and Implementation(OSDI’20).Carlsbad,CA,USA:USENIX Association,2020:863-879.
[15]WU J J.A deployment method and device for heterogeneous platforms based on TVM compiler:CN202010654954[P].2023-12-25.
[16]CHEN T Q,GUESTRIN C.XGBoost:A Scalable Tree Boosting System[C]//Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining(KDD ’16).New York,NY,USA:Association for Computing Machinery,2016:785-794.