Computer Science ›› 2025, Vol. 52 ›› Issue (2): 299-309. doi: 10.11896/jsjkx.240900101

• Computer Network •

Adaptive Operator Parallel Partitioning Method for Heterogeneous Embedded Chips in AIoT

LIN Zheng¹, LIU Sicong¹, GUO Bin¹, DING Yasan¹, YU Zhiwen¹,²

  1 College of Computer Science, Northwestern Polytechnical University, Xi'an 710072, China
    2 Harbin Engineering University, Harbin 150001, China
  • Received: 2024-09-16  Revised: 2024-11-02  Online: 2025-02-15  Published: 2025-02-17
  • About author: LIN Zheng, born in 2002, postgraduate. His main research interests include ubiquitous computing and mobile crowd sensing.
    GUO Bin, born in 1980, Ph.D, Ph.D supervisor, is a member of CCF (No. E200019107S). His main research interests include ubiquitous computing and mobile crowd sensing.
  • Supported by:
    National Science Fund for Distinguished Young Scholars of China (62025205) and National Natural Science Foundation of China (62032020, 62302017).

Abstract: With the continuous improvement in quality of life and the rapid development of technology, mobile devices such as smartphones have become ubiquitous worldwide. Against this backdrop, deploying deep neural networks on mobile devices has become a research hotspot. Deep neural networks not only drive significant progress in mobile applications but also place higher demands on the energy-efficiency management of battery-powered devices. Meanwhile, the rise of heterogeneous processors in today's mobile devices brings new challenges to energy-efficiency optimization: distributing computing tasks across different processors to parallelize and accelerate deep neural network inference does not necessarily reduce energy consumption and may even increase it. To address this issue, this paper proposes an energy-efficient adaptive parallel computing scheduling system for deep neural networks. The system comprises a runtime energy consumption analyzer and an online operator partitioning executor, and dynamically adjusts operator allocation according to changing device conditions, optimizing the energy efficiency of computation on a mobile device's heterogeneous processors while maintaining high responsiveness. Experimental results show that, compared with baseline methods, the proposed system reduces average energy consumption and latency by 5.19% and 9.0%, respectively, and reduces maximum energy consumption and latency by 18.35% and 21.6% for deep neural network inference on mobile devices.
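The abstract describes a two-component design: a runtime energy consumption analyzer that predicts per-operator costs, and an online executor that partitions operators across heterogeneous processors. As a purely illustrative sketch (not the authors' implementation), the Python fragment below shows one way such an executor could pick per-operator processor assignments from predicted energy and latency costs; all names here (OperatorCost, partition_operators, the weighting factor alpha) are hypothetical.

```python
# Hypothetical sketch of energy-aware operator partitioning across
# heterogeneous processors; the names and cost model are illustrative only.
from dataclasses import dataclass

@dataclass
class OperatorCost:
    name: str
    energy: dict    # processor -> predicted energy (mJ)
    latency: dict   # processor -> predicted latency (ms)

def partition_operators(ops, processors, alpha=0.7):
    """Greedily assign each operator to the processor that minimizes a
    weighted energy/latency score (alpha trades energy against latency)."""
    assignment = {}
    for op in ops:
        best = min(
            processors,
            key=lambda p: alpha * op.energy[p] + (1 - alpha) * op.latency[p],
        )
        assignment[op.name] = best
    return assignment

# Toy example: in a real system a runtime analyzer would refresh these
# predictions as device conditions (thermal state, DVFS level) change.
ops = [
    OperatorCost("conv1", {"cpu": 4.1, "gpu": 2.3}, {"cpu": 9.0, "gpu": 5.5}),
    OperatorCost("fc1",   {"cpu": 1.2, "gpu": 2.0}, {"cpu": 2.1, "gpu": 3.4}),
]
print(partition_operators(ops, ["cpu", "gpu"]))  # {'conv1': 'gpu', 'fc1': 'cpu'}
```

A practical partitioner would also have to account for cross-processor data-transfer costs and re-run this decision online as conditions drift, which is precisely the role the paper assigns to its runtime analyzer and online executor.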

Key words: Deep neural networks, Mobile device, Energy efficiency optimization, Heterogeneous processors, Energy consumption prediction

CLC Number: TP391