Computer Science ›› 2023, Vol. 50 ›› Issue (11A): 220800045-7. doi: 10.11896/jsjkx.220800045

• Computer Software & Architecture •

Lightweight Network Hardware Acceleration Design for Edge Computing

YU Yunjun1, ZHANG Pengfei1, GONG Hancheng2, CHEN Min2   

  1 School of Information Engineering, Nanchang University, Nanchang 330000, China
  2 Jiangxi Jiangtou Digital Economy Research Institute, Nanchang 330000, China
  • Published: 2023-11-09
  • About author: YU Yunjun, born in 1978, Ph.D, associate professor. His main research interests include fault diagnosis, active disturbance rejection control (ADRC), data-driven optimal control and its applications in microgrids, and low-carbon electricity technology.
  • Supported by:
    National International Science and Technology Cooperation Project (2014DFG72240) and Key R&D Program in Jiangxi Province (20214BBG74006).

Abstract: With the growth of data at edge devices and the widening application of neural networks, edge computing has risen to share the load previously carried by cloud-centric big data processing. Field-programmable gate arrays (FPGAs) have shown excellent properties for edge computing and for building neural network accelerators, owing to their flexible architecture and low power consumption. However, FPGA solutions based on the traditional convolution algorithm are often limited by the number of on-chip computing units. In this paper, Zynq is used as the hardware acceleration platform, parameters are quantized to fixed point, and array partitioning is applied to raise pipeline throughput. The Winograd fast convolution algorithm replaces traditional convolution, trading multiplications in the convolution for additions and thereby reducing the computational complexity of the model, which greatly improves the computational performance of the designed accelerator. Experiments show that the XC7Z035 achieves 43.5 GOP/s at a 150 MHz clock, with energy efficiency 129 times that of a Xeon(R) Silver 4214R and 159 times that of a dual-core ARM. Under tight resource and power budgets, the proposed solution delivers high performance and is well suited to deploying lightweight neural networks at the edge of the network.
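The fixed-point parameter quantization mentioned in the abstract can be sketched as follows. This is an illustrative sketch, not the paper's implementation; the bit-width split (8-bit word with 6 fractional bits) is an assumed example, not a value taken from the paper.

```python
import numpy as np

def quantize_fixed_point(w, total_bits=8, frac_bits=6):
    """Quantize float weights to signed fixed point with `frac_bits`
    fractional bits, saturating to the representable integer range."""
    scale = 2 ** frac_bits
    qmin = -(2 ** (total_bits - 1))
    qmax = 2 ** (total_bits - 1) - 1
    return np.clip(np.round(w * scale), qmin, qmax).astype(np.int32)

def dequantize(q, frac_bits=6):
    """Map the integer codes back to real values for error checking."""
    return q.astype(np.float64) / (2 ** frac_bits)

w = np.array([0.731, -0.292, 0.058])
q = quantize_fixed_point(w)          # integer codes stored on-chip
w_hat = dequantize(q)                # reconstructed weights
# Away from saturation, round-off error is at most half an LSB (2**-7 here).
assert np.max(np.abs(w - w_hat)) <= 2 ** -7
```

On the FPGA, only the integer codes and integer arithmetic are used; the dequantization step here exists solely to bound the quantization error.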

Key words: Edge computing, Hardware acceleration, Lightweight convolutional neural networks, Winograd, FPGA
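The Winograd substitution described in the abstract can be illustrated with the smallest 1-D case, F(2,3): two outputs of a 3-tap filter computed with 4 element-wise multiplications instead of 6, at the cost of extra additions. This is a generic sketch of the algorithm (in the standard Lavin-Gray formulation), not the paper's 2-D accelerator implementation.

```python
import numpy as np

# Winograd F(2,3) transform matrices.
BT = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=float)   # data transform
G  = np.array([[1,    0,   0],
               [0.5,  0.5, 0.5],
               [0.5, -0.5, 0.5],
               [0,    0,   1]], dtype=float)    # filter transform
AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=float)    # output transform

def winograd_f23(d, g):
    """Two outputs of correlating a length-4 data tile d with a 3-tap
    filter g, using 4 multiplications in the element-wise product."""
    return AT @ ((G @ g) * (BT @ d))

d = np.array([1.0, 2.0, 3.0, 4.0])
g = np.array([0.5, -1.0, 0.25])
direct = np.array([d[0]*g[0] + d[1]*g[1] + d[2]*g[2],
                   d[1]*g[0] + d[2]*g[1] + d[3]*g[2]])
assert np.allclose(winograd_f23(d, g), direct)
```

In an accelerator, the filter transform G @ g can be precomputed offline, so only the data and output transforms (additions and shifts) run on the fabric alongside the 4 multiplications per tile, which is how Winograd eases the pressure on scarce on-chip multiplier units.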

CLC Number: TP391