Computer Science ›› 2025, Vol. 52 ›› Issue (11A): 241000184-7. doi: 10.11896/jsjkx.241000184

• Computer Graphics & Multimedia •

RMSFF-SSD: Remote Sensing Image Object Detection Model Based on Reparameterization and Multi-scale Feature Fusion

CHEN Haiyan, MA Shuhao, ZHANG Zhenxiao

  1. School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050, China
  • Online: 2025-11-15 Published: 2025-11-10
  • Corresponding author: CHEN Haiyan (chenhaiyan@sina.com)
  • Supported by:
    National Natural Science Foundation of China (62161019, 62061024).

Abstract: Remote sensing image object detection is widely used in land resource surveys, disaster monitoring, and military reconnaissance. The SSD (Single Shot MultiBox Detector) model struggles to extract effective features of small objects in remote sensing images, which degrades its small-object detection. To address this problem, this paper proposes RMSFF-SSD (Reparameterization Multi-Scale Feature Fusion SSD), a remote sensing image object detection model based on reparameterization and multi-scale feature fusion that improves on the SSD model. First, the convolutional layers in the backbone feature extraction network of SSD are replaced with reparameterizable convolutions, and the SE attention mechanism is introduced into these convolutions to capture dependencies between channels and suppress useless features. Second, the features extracted by the feature extraction network are combined through multi-level feature fusion, which integrates global information with local detail information to further enhance object features. Finally, the six feature maps of different scales obtained after fusion are used for object detection. Experiments on the NWPU VHR-10 dataset show that the proposed RMSFF-SSD512 model achieves an average precision of 89.7%, significantly higher than DSSD (78.7%), FSSD (86.7%), FPN (68.9%), Faster R-CNN (44.2%), and YOLOv5 (83.7%).

Key words: Reparameterization, Feature fusion, Remote sensing detection, SSD, SE attention mechanism
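
The reparameterization idea named in the abstract can be illustrated with a small sketch. It assumes a RepVGG-style block with three parallel branches (a 3x3 convolution, a 1x1 convolution, and an identity shortcut), single-channel input, stride 1, zero padding, and no batch normalization or SE attention; the abstract does not specify the block layout, so all of these are assumptions, and the function names are hypothetical. Because convolution is linear, the three branches can be folded into one 3x3 kernel that produces the same output.

```python
# Sketch only: single channel, stride 1, zero padding 1, no batch norm, no SE.
# The three-branch layout (3x3 + 1x1 + identity) is an assumption borrowed
# from RepVGG-style blocks; the paper's actual block may differ.

def conv2d_same(image, kernel):
    """3x3 cross-correlation with zero padding, so output size equals input size."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(3):
                for dj in range(3):
                    y, x = i + di - 1, j + dj - 1
                    if 0 <= y < h and 0 <= x < w:
                        acc += image[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out


def pad_1x1_to_3x3(weight):
    """Embed a 1x1 weight at the centre of an otherwise zero 3x3 kernel."""
    return [[0.0, 0.0, 0.0], [0.0, weight, 0.0], [0.0, 0.0, 0.0]]


def merge_branches(k3, w1):
    """Fold the 3x3, 1x1 and identity branches into a single 3x3 kernel."""
    k1 = pad_1x1_to_3x3(w1)
    k_id = pad_1x1_to_3x3(1.0)  # identity equals a 1x1 convolution with weight 1
    return [[k3[r][c] + k1[r][c] + k_id[r][c] for c in range(3)] for r in range(3)]


def multi_branch_forward(image, k3, w1):
    """Multi-branch (training-time) form: sum of the three branch outputs."""
    branch3 = conv2d_same(image, k3)
    h, w = len(image), len(image[0])
    return [[branch3[i][j] + w1 * image[i][j] + image[i][j] for j in range(w)]
            for i in range(h)]


# Hypothetical toy values chosen only for illustration.
image = [[1.0, 2.0, 0.0], [0.0, 1.0, 3.0], [2.0, 1.0, 1.0]]
k3 = [[0.5, -1.0, 0.0], [1.0, 0.25, -0.5], [0.0, 2.0, 1.0]]
w1 = 0.75

train_out = multi_branch_forward(image, k3, w1)
deploy_out = conv2d_same(image, merge_branches(k3, w1))
max_diff = max(abs(train_out[i][j] - deploy_out[i][j])
               for i in range(3) for j in range(3))
```

Here `max_diff` measures the gap between the multi-branch output and the merged single-kernel output. In a full block, batch normalization would first be folded into each branch's weights and bias before the kernels are summed; that step is omitted here.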

CLC Number: 

  • TP391.4