Improved FCOS Target Detection Algorithm

doi:10.11896/jsjkx.210900220

Abstract

Abstract: The classical anchor-free object detector FCOS (Fully Convolutional One-Stage Object Detection) has difficulty extracting target features fully, combines location and content information poorly, and loses performance because positive and negative samples are not adequately distinguished. To address these problems, an improved FCOS object detection algorithm is proposed. First, a deformable convolution module and a global attention module are added to the ResNet50 feature extraction network to improve its ability to capture feature information. Then, the FPN feature pyramid is combined with a deep link layer to form a multi-scale feature fusion module that improves feature extraction. Finally, a module that adaptively divides positive and negative samples is added to make the detection boxes more accurate, which improves regression accuracy and, in turn, the detection results. To evaluate the algorithm, experiments are conducted on the COCO and VOC datasets. Compared with the original FCOS algorithm, the average precision of the proposed algorithm improves by 2.3% and 1.8% on the two datasets, respectively, with a marked improvement in small-object detection on the COCO dataset.

Key words: Target detection, Deformable convolution, Global attention, Multi-scale features, Feature pyramid, Positive and negative samples
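
The abstract does not say how the adaptive division of positive and negative samples is computed. One well-known scheme of this kind is adaptive training sample selection (ATSS), which derives a per-object IoU threshold from the statistics of nearby candidates. The sketch below illustrates that general idea for a single ground-truth box and a single feature level; it is an assumption for illustration, not the authors' implementation. The function names, the `top_k` value and the use of the population standard deviation are all hypothetical choices.

```python
from statistics import mean, pstdev


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def center(box):
    """Center point (cx, cy) of a box."""
    return ((box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0)


def select_positives(gt, candidates, top_k=9):
    """Return sorted indices of candidates treated as positives for one
    ground-truth box. Hypothetical helper, single feature level only."""
    gx, gy = center(gt)

    def sq_dist(i):
        cx, cy = center(candidates[i])
        return (cx - gx) ** 2 + (cy - gy) ** 2

    # Step 1: keep the top_k candidates whose centers are closest to the object.
    nearest = sorted(range(len(candidates)), key=sq_dist)[:top_k]
    if not nearest:
        return []

    # Step 2: adaptive threshold = mean + standard deviation of their IoUs.
    # Population std is an assumption made here so one candidate still works.
    ious = [iou(gt, candidates[i]) for i in nearest]
    threshold = mean(ious) + pstdev(ious)

    # Step 3: positive if IoU reaches the threshold and the center lies
    # inside the ground-truth box; everything else is a negative.
    positives = []
    for i, value in zip(nearest, ious):
        cx, cy = center(candidates[i])
        inside = gt[0] < cx < gt[2] and gt[1] < cy < gt[3]
        if value >= threshold and inside:
            positives.append(i)
    return sorted(positives)


if __name__ == "__main__":
    gt_box = (0, 0, 10, 10)
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (5, 5, 15, 15), (20, 20, 30, 30)]
    print(select_positives(gt_box, boxes, top_k=4))
```

Because the threshold is recomputed for every object, an object whose candidates all overlap it poorly (often a small object) still receives positives, whereas a fixed IoU cutoff could leave it with none.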

CLC Number:

TP391.41

CHEN Jin-ling (陈金令), CHENG Mao-kai (程茂凯), XU Zi-han (徐紫涵). Improved FCOS Target Detection Algorithm[J]. Computer Science (计算机科学), 2022, 49(11A): 210900220-6. https://doi.org/10.11896/jsjkx.210900220


