Computer Science ›› 2025, Vol. 52 ›› Issue (8): 214-221. doi: 10.11896/jsjkx.241000019

• Computer Graphics & Multimedia •

Improved RT-DETR Algorithm for Small Object Detection in Remote Sensing Images

SHEN Tao1, ZHANG Xiuzai1,2, XU Dai1

  1 School of Electronic and Information Engineering, Nanjing University of Information Science & Technology, Nanjing 210044, China
  2 Jiangsu Collaborative Innovation Center of Atmospheric Environment and Equipment Technology, Nanjing University of Information Science & Technology, Nanjing 210044, China
  • Received: 2024-10-08 Revised: 2025-02-07 Online: 2025-08-15 Published: 2025-08-08
  • Corresponding author: ZHANG Xiuzai (zxzhering@163.com)
  • Author e-mail: 1415049974@qq.com
  • About author: SHEN Tao, born in 2000, postgraduate. His main research interests include deep learning and object detection.
    ZHANG Xiuzai, born in 1989, Ph.D. His main research interests include meteorological communication technology and security, and machine learning.
  • Supported by:
    General Program of the National Social Science Foundation of China, "Comparative Study on Artificial Intelligence Strategies Between China and America" (22BZZ080).

Abstract: Object detection algorithms for remote sensing images suffer from high miss and false detection rates and perform poorly on small objects. To address these problems, this paper proposes an improved RT-DETR (Real-Time Detection Transformer) object detection algorithm. To improve the model's ability to detect objects of different sizes, variable kernel convolution and a diversified branch structure are employed to enrich multi-scale representation. To avoid losing small-object information during downsampling, Haar wavelet downsampling is used to retain as much feature information as possible. To prevent small-object features from being lost through complex network iterations and pooling, the SPABC3 module is designed, which uses symmetric activation functions and residual connections to enhance high-contribution information and suppress redundant information. Experimental results show that the improved RT-DETR algorithm achieves an mAP@0.5 of 42.7% on the VisDrone2019 dataset and 95.3% on the RSOD dataset, outperforming the mainstream algorithms it is compared with, improving the detection accuracy of small objects in remote sensing images, and meeting the requirements of small object detection in such images.

Key words: Small object detection, RT-DETR, AKConv, Haar wavelet downsampling, Swift parameter-free attention
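
The abstract states that Haar wavelet downsampling retains as much feature information as possible when resolution is halved. The sketch below illustrates that property on a single-channel map in plain Python. It is only an illustration under stated assumptions, not the authors' implementation: the function names `haar_downsample` and `haar_upsample` are hypothetical, and any learned layers the paper's module applies after the transform are omitted.

```python
def haar_downsample(fmap):
    """One-level 2D Haar transform of a single-channel map (list of rows).

    Returns four half-resolution maps: the approximation band plus three
    detail bands. Hypothetical helper for illustration, not the paper's code.
    """
    h, w = len(fmap), len(fmap[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    approx, col_diff, row_diff, diag = [], [], [], []
    for i in range(0, h, 2):
        rows = ([], [], [], [])
        for j in range(0, w, 2):
            # The four pixels of one non-overlapping 2x2 block.
            a, b = fmap[i][j], fmap[i][j + 1]
            c, d = fmap[i + 1][j], fmap[i + 1][j + 1]
            rows[0].append((a + b + c + d) / 2)  # local average
            rows[1].append((a - b + c - d) / 2)  # left column minus right
            rows[2].append((a + b - c - d) / 2)  # top row minus bottom
            rows[3].append((a - b - c + d) / 2)  # diagonal difference
        for band, row in zip((approx, col_diff, row_diff, diag), rows):
            band.append(row)
    return approx, col_diff, row_diff, diag


def haar_upsample(approx, col_diff, row_diff, diag):
    """Invert haar_downsample, showing that the four bands lose nothing."""
    out = []
    for ra, rc, rr, rd in zip(approx, col_diff, row_diff, diag):
        top, bottom = [], []
        for s, x, y, z in zip(ra, rc, rr, rd):
            top += [(s + x + y + z) / 2, (s - x + y - z) / 2]
            bottom += [(s + x - y - z) / 2, (s - x - y + z) / 2]
        out += [top, bottom]
    return out


fmap = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16]]
bands = haar_downsample(fmap)
print(bands[0])                       # [[7.0, 11.0], [23.0, 27.0]]
print(haar_upsample(*bands) == fmap)  # True
```

Strided convolution or pooling keeps only one value per 2x2 block. Here the spatial size is also halved, but the three detail bands carry the remaining information into extra channels, which is why the original map can be reconstructed exactly.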

CLC Number: TP393