Computer Science ›› 2024, Vol. 51 ›› Issue (9): 162-172. doi: 10.11896/jsjkx.230700106
李允臣, 张睿, 王家宝, 李阳, 王梓祺, 陈瑶
LI Yunchen, ZHANG Rui, WANG Jiabao, LI Yang, WANG Ziqi, CHEN Yao
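Abstract: Targets in high-altitude UAV aerial imagery are generally small and have weak features, and they are strongly affected by complex weather conditions, so object detection based on single-modality visible or infrared images suffers from high missed-detection and false-detection rates. To address this, a reparameterization-enhanced dual-modal real-time object detection model, DM-YOLO, is proposed. First, visible and infrared images are fused by channel concatenation, which combines the complementary information of the two modalities at very low cost. Second, a more efficient reparameterization module is proposed, and a more powerful backbone, RepCSPDarkNet, is built on it, effectively strengthening the backbone's feature extraction for dual-modal images. Third, a multi-level feature fusion module is proposed that fuses the multi-scale feature information of weak, small targets through multi-receptive-field convolution and an attention mechanism, enhancing their multi-scale feature representation. Finally, the deep detection layer of the feature pyramid, which contributes little to detecting weak, small targets, is removed, reducing model size while keeping detection accuracy unchanged. Experimental results show that on the large-scale dual-modal image dataset DroneVehicle, DM-YOLO's detection accuracy is 2.45% higher than that of the baseline YOLOv5s and exceeds that of the comparably sized YOLOv6 and YOLOv7 models, effectively improving the accuracy and robustness of object detection under complex lighting conditions. Its detection speed reaches 82 FPS, which meets real-time detection requirements.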
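The channel-concatenation fusion named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `fuse_by_channel_concat`, the channel-height-width list layout, the 3-channel visible and 1-channel infrared shapes, and the pixel values are all assumptions made for illustration, and the two images are assumed to be already registered to the same size.

```python
from typing import List

# Assumed layout: channels x height x width (CHW), stored as nested lists.
Image = List[List[List[float]]]


def fuse_by_channel_concat(visible: Image, infrared: Image) -> Image:
    """Stack the channels of two registered images into one multi-channel input."""
    vis_size = (len(visible[0]), len(visible[0][0]))
    ir_size = (len(infrared[0]), len(infrared[0][0]))
    if vis_size != ir_size:
        raise ValueError("visible and infrared images must share height and width")
    # Concatenate along the channel axis; pixel values are copied unchanged.
    return [[row[:] for row in channel] for channel in visible + infrared]


# Hypothetical 2x2 example: 3-channel visible image, 1-channel infrared image.
visible = [[[0.1, 0.2], [0.3, 0.4]] for _ in range(3)]
infrared = [[[0.9, 0.8], [0.7, 0.6]]]

fused = fuse_by_channel_concat(visible, infrared)
print(len(fused), len(fused[0]), len(fused[0][0]))  # prints: 4 2 2
```

Because the fusion only widens the input tensor, the extra cost in a detector of this kind is limited to the first convolution accepting more input channels, which is consistent with the abstract's "very low cost" description.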