Computer Science, 2022, Vol. 49, Issue (3): 211-217. doi: 10.11896/jsjkx.201200019
ZHANG Lyu (张侣), ZHOU Bo-wen (周博文), WU Liang-hong (吴亮红)
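Abstract: SSD (Single Shot MultiBox Detector) is a single-stage detection algorithm based on convolutional neural networks. Compared with two-stage detectors, it significantly increases detection speed while retaining reasonable accuracy, but it still falls short of the needs of many practical applications, particularly small-object detection, where its accuracy is inadequate. To address this shortcoming, the paper proposes Res-Am CNN (Residual with Attention Module Convolutional Neural Networks), a feature extraction network built on an improved residual structure and a convolutional attention module, which substantially improves the network's feature extraction ability. It also introduces Additive Fusion with Upsample (AFU) into the original SSD pyramid structure to fuse features, strengthening the representational ability of the shallow features. Experiments on the PASCAL VOC dataset show that, compared with the original SSD and mainstream detection networks, the Res-Am&AFU SSD network (SSD with Res-Am CNN and AFU) reaches a mean Average Precision (mAP) of 69.1% on the VOC test set. In accuracy it leads single-stage networks and approaches two-stage networks, and in detection speed it is far faster than two-stage networks. On the small-object test set, Res-Am&AFU SSD reaches an mAP of 67.2%, an improvement of 9.4% over the original SSD. The method is also more flexible and requires no pre-training.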
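The abstract names AFU only at a high level: a deeper, lower-resolution feature map is upsampled and added to a shallower one. The following is a minimal pure-Python sketch of that general idea, not the paper's implementation. The function names `upsample_nearest` and `additive_fusion`, the use of nearest-neighbour interpolation, the fixed 2x scale, and the single-channel feature maps are all assumptions made for illustration; the abstract does not state which interpolation or channel alignment the authors use.

```python
from typing import List

Grid = List[List[float]]  # one single-channel feature map (assumption)


def upsample_nearest(feature: Grid, scale: int = 2) -> Grid:
    """Nearest-neighbour upsampling: repeat every value `scale` times per axis."""
    out: Grid = []
    for row in feature:
        # Widen the row by repeating each value `scale` times.
        wide = [value for value in row for _ in range(scale)]
        # Repeat the widened row `scale` times (fresh copies, no aliasing).
        out.extend(list(wide) for _ in range(scale))
    return out


def additive_fusion(shallow: Grid, deep: Grid, scale: int = 2) -> Grid:
    """Upsample the deep map to the shallow map's size, then add element-wise."""
    up = upsample_nearest(deep, scale)
    if len(up) != len(shallow) or len(up[0]) != len(shallow[0]):
        raise ValueError("feature map shapes do not match after upsampling")
    return [
        [s + d for s, d in zip(shallow_row, up_row)]
        for shallow_row, up_row in zip(shallow, up)
    ]


# Hypothetical toy inputs: a 4x4 shallow map and a 2x2 deep map.
shallow = [[1.0, 1.0, 1.0, 1.0] for _ in range(4)]
deep = [[10.0, 20.0], [30.0, 40.0]]
fused = additive_fusion(shallow, deep)
```

Because the fusion is an addition rather than a concatenation, the fused map keeps the shallow map's shape, so downstream layers would not need to change their input size. In a real detector the two maps would also need matching channel counts, which this sketch sidesteps by using a single channel.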