Computer Science (计算机科学), 2019, Vol. 46, Issue 6A: 279-283.
吕培建1, 陈佳鹏2, 袁飞1, 彭强2, 项煜3
LV Pei-jian1, CHEN Jia-peng2, YUAN Fei1, PENG Qiang2, XIANG Yu3
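Abstract: The rapid development of convolutional neural networks has greatly improved object detection performance. SqueezeDet exploits neither multi-scale nor contextual information. To address this, the paper combines skip connections and shortcut connections to aggregate multi-scale feature maps, and uses dilated convolution to enlarge the receptive field and capture more context. The resulting context-based multi-scale object detection model improves the accuracy and robustness of detection in complex scenes. The model fuses feature maps at three resolutions: the smallest and middle feature maps pass through dilated convolutions with different sampling rates to aggregate context; bilinear interpolation then doubles the resolution of the smallest feature map; and the largest feature map is downsampled by a convolutional layer to the size of the middle feature map and fused with it. Shortcut connections link the feature maps of different sizes so that lost information can be recovered from the larger feature maps. In experiments on KITTI, a public international benchmark dataset for autonomous driving, the proposed algorithm improves accuracy by about 5% over SqueezeDet while reaching an inference speed of 30 fps on a GPU.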
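The size bookkeeping implied by the fusion step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the 3x3 kernels, the dilation rates 2 and 4, the stride-2 downsampling convolution, the function names, and the example feature-map sizes are all assumptions, since the abstract specifies none of them.

```python
def effective_kernel(k: int, dilation: int) -> int:
    """Span along one axis covered by a k-tap kernel with the given dilation."""
    return k + (k - 1) * (dilation - 1)


def conv_out(size: int, k: int, stride: int = 1, dilation: int = 1,
             padding: int = 0) -> int:
    """Output length of a convolution along one axis (floor convention)."""
    return (size + 2 * padding - effective_kernel(k, dilation)) // stride + 1


def fusion_shape(large, mid, small, mid_rate=2, small_rate=4):
    """Return the common (height, width) at which the three maps are fused.

    Assumed layout (hypothetical, for illustration only):
      - mid and small maps: 3x3 dilated conv, padding = dilation (size kept)
      - small map: then upsampled by 2 (bilinear interpolation in the paper)
      - large map: 3x3 conv with stride 2 and padding 1 (size halved)
    """
    mid_ctx = tuple(conv_out(s, 3, dilation=mid_rate, padding=mid_rate)
                    for s in mid)
    small_ctx = tuple(conv_out(s, 3, dilation=small_rate, padding=small_rate)
                      for s in small)
    small_up = tuple(2 * s for s in small_ctx)
    large_down = tuple(conv_out(s, 3, stride=2, padding=1) for s in large)
    if not (mid_ctx == small_up == large_down):
        raise ValueError("feature maps do not align for fusion")
    return mid_ctx


if __name__ == "__main__":
    # Hypothetical feature-map sizes, each level half the previous one.
    print(effective_kernel(3, 2), effective_kernel(3, 4))
    print(fusion_shape(large=(96, 312), mid=(48, 156), small=(24, 78)))
```

With padding equal to the dilation rate, a 3x3 dilated convolution widens the receptive field without changing the map size, which is why the three branches can be brought to the middle resolution by a single 2x upsampling and a single stride-2 convolution.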