FPGA-based CNN Image Recognition Acceleration and Optimization

doi:10.11896/jsjkx.200600089

Computer Science, 2021, Vol. 48, Issue (4): 205-212. doi: 10.11896/jsjkx.200600089

• Computer Graphics & Multimedia •

FPGA-based CNN Image Recognition Acceleration and Optimization

QI Yan-rong¹, ZHOU Xia-bing², LI Bin¹, ZHOU Qing-lei¹

1 School of Information Engineering, Zhengzhou University, Zhengzhou 450001, China
2 School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China

Received: 2020-06-24 Revised: 2020-08-13 Online: 2021-04-15 Published: 2021-04-09
About author: QI Yan-rong, born in 1995, postgraduate. Her main research interests include image processing and high-performance computing. (17319793885@163.com)
LI Bin, born in 1986, Ph.D., associate professor. His main research interests include information security and high-performance computing.
Supported by:
National Key R&D Program “Public Safety Risk Prevention and Control and Emergency Technology Assembly” Key Special Project (2018XXXXXXX01) and National Natural Science Foundation of China (61702518).

Abstract

Abstract: Convolutional neural networks (CNNs) are widely used in image classification, speech recognition, video analysis, document analysis, and other applications. Because CNNs are computationally intensive, they are often accelerated with GPUs. However, GPUs have high power consumption, which makes them poorly suited to the CNN inference stage. This paper therefore studies FPGA-based acceleration and optimization of CNN image recognition. The OpenCL SDK provided by Intel FPGA is used to design and optimize the CNN forward model on an FPGA board. First, to address the computation problem, the design is divided into functional modules so that the FPGA's high computing efficiency is fully exploited. Second, the core algorithm is optimized to improve running speed: the feature map processing operations are analyzed, a parameter sharing strategy reduces the amount of data stored, and data is transferred through pipelines to reduce the number of accesses to off-chip storage. Finally, the design of the data cache, data flow, and loops is optimized to alleviate the FPGA's on-chip resource constraints, and the parameters are quantized to reduce the FPGA memory resources occupied. Experimental results show that the FPGA has lower power consumption: CPU power consumption is 2.1 times that of the FPGA, and GPU power consumption is 6.5 times that of the FPGA. Compared with methods proposed in recent literature in related fields, the proposed method achieves higher throughput and computational performance.

Key words: CNN, Data flow optimization, FPGA, Image recognition, Module division, OpenCL

CLC Number: TP391

QI Yan-rong, ZHOU Xia-bing, LI Bin, ZHOU Qing-lei. FPGA-based CNN Image Recognition Acceleration and Optimization[J]. Computer Science, 2021, 48(4): 205-212.
