Object Detection Algorithm Based on Context and Multi-scale Information Fusion

Abstract

Recent advances in convolutional neural networks (CNNs) have led to significant improvements in object detection. To address the lack of context and multi-scale information in the SqueezeDet algorithm, this paper combines skip connections and shortcut connections to aggregate multi-scale feature maps, and uses dilated convolution to enlarge the convolutional receptive field and capture context. The resulting context-based multi-scale object detection model improves the accuracy and robustness of object detection in complex scenes. The model fuses feature maps at three resolutions: the smallest and middle-sized feature maps gather context through dilated convolution, the smallest feature maps are doubled in size by bilinear interpolation, and the largest feature maps are down-sampled by a convolution with stride 2. The three feature maps then have the same size and can be fused. In addition, shortcut connections link feature maps of different sizes to recover information lost from the larger feature maps. Evaluated on KITTI, an international benchmark dataset for autonomous driving, the model achieves a 6% improvement over SqueezeDet and runs at 30 fps on a GPU.
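The size-matching step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the dilation rate of 2, the 3x3 kernels, the channel counts, the random weights standing in for learned ones, and fusion by channel concatenation are all assumptions made for illustration, and the shortcut connections are omitted. The helper names `conv2d`, `upsample2x_bilinear`, and `fuse` are hypothetical.

```python
import numpy as np

def conv2d(x, w, stride=1, dilation=1):
    """Naive 2D cross-correlation on a (C_in, H, W) array.
    w has shape (C_out, C_in, k, k); zero padding keeps the size at stride 1."""
    c_out, _, k, _ = w.shape
    k_eff = k + (k - 1) * (dilation - 1)  # kernel extent once taps are spread out
    pad = k_eff // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    h_out = (x.shape[1] + 2 * pad - k_eff) // stride + 1
    w_out = (x.shape[2] + 2 * pad - k_eff) // stride + 1
    out = np.zeros((c_out, h_out, w_out))
    for i in range(k):
        for j in range(k):
            r0, c0 = i * dilation, j * dilation  # offset of kernel tap (i, j)
            patch = xp[:, r0:r0 + stride * (h_out - 1) + 1:stride,
                          c0:c0 + stride * (w_out - 1) + 1:stride]
            out += np.einsum('oc,chw->ohw', w[:, :, i, j], patch)
    return out

def upsample2x_bilinear(x):
    """Double H and W of a (C, H, W) array with bilinear interpolation
    (half-pixel sample positions, clamped at the borders)."""
    _, h, w = x.shape
    ys = np.clip((np.arange(2 * h) + 0.5) / 2 - 0.5, 0, h - 1)
    xs = np.clip((np.arange(2 * w) + 0.5) / 2 - 0.5, 0, w - 1)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[None, :, None]
    wx = (xs - x0)[None, None, :]
    top = x[:, y0][:, :, x0] * (1 - wx) + x[:, y0][:, :, x1] * wx
    bot = x[:, y1][:, :, x0] * (1 - wx) + x[:, y1][:, :, x1] * wx
    return top * (1 - wy) + bot * wy

def fuse(small, middle, large, rng):
    """Bring three feature maps to the middle resolution and concatenate them.
    Concatenation is an assumption; the abstract does not name the fusion op."""
    def rand_w(c_out, c_in):
        # Random weights stand in for learned convolution kernels.
        return rng.standard_normal((c_out, c_in, 3, 3)) * 0.1
    c = middle.shape[0]
    small_ctx = conv2d(small, rand_w(c, small.shape[0]), dilation=2)    # context
    middle_ctx = conv2d(middle, rand_w(c, middle.shape[0]), dilation=2) # context
    small_up = upsample2x_bilinear(small_ctx)                           # x2 size
    large_down = conv2d(large, rand_w(c, large.shape[0]), stride=2)     # /2 size
    return np.concatenate([small_up, middle_ctx, large_down], axis=0)

rng = np.random.default_rng(0)
small = rng.standard_normal((8, 6, 12))    # lowest resolution
middle = rng.standard_normal((8, 12, 24))  # target resolution
large = rng.standard_normal((8, 24, 48))   # highest resolution
fused = fuse(small, middle, large, rng)
print(fused.shape)
```

With these example sizes, all three branches end at 12x24 before concatenation, which is the property the abstract relies on. A 3x3 kernel with dilation 2 covers a 5x5 window while still using nine weights, which is how dilated convolution widens the receptive field without adding parameters.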

Key words: Convolutional neural network, Dilated convolution, Shortcut connection, Skip connection

CLC Number: TP391

LV Pei-jian, CHEN Jia-peng, YUAN Fei, PENG Qiang, XIANG Yu. Object Detection Algorithm Based on Context and Multi-scale Information Fusion[J].Computer Science, 2019, 46(6A): 279-283.


