Computer Science, 2025, Vol. 52, Issue (11A): 241000184-7. doi: 10.11896/jsjkx.241000184

• Image Processing & Multimedia Technology •

RMSFF-SSD: Remote Sensing Image Object Detection Model Based on Reparameterization and Multi-scale Feature Fusion

CHEN Haiyan, MA Shuhao, ZHANG Zhenxiao   

  1. School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050, China
  • Online: 2025-11-15 Published: 2025-11-10
  • Supported by:
    National Natural Science Foundation of China (62161019, 62061024).

Abstract: Remote sensing image target detection is widely used in fields such as land resource surveying, disaster monitoring, and military reconnaissance. SSD (Single Shot MultiBox Detector) models struggle to extract features of small targets in remote sensing images, which degrades small-target detection. To address this, this paper proposes RMSFF-SSD (Reparameterization Multi-Scale Feature Fusion SSD), a remote sensing image target detection model that improves on SSD through reparameterization and multi-scale feature fusion. First, the convolutional layers in SSD's backbone feature extraction network are replaced with reparameterizable convolutions, and the SE (Squeeze-and-Excitation) attention mechanism is introduced into these convolutions to capture dependencies between channels and suppress useless features. Second, the features extracted by the backbone are combined through multi-level feature fusion, integrating global information with local detail to further enhance target features. Finally, the six feature maps of different scales obtained after fusion are used for target detection. Experiments on the NWPU VHR-10 dataset show that the proposed RMSFF-SSD512 model achieves an average precision of 89.7%, higher than DSSD (78.7%), FSSD (86.7%), FPN (68.9%), Faster R-CNN (44.2%), and YOLOv5 (83.7%).
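
The abstract does not say which reparameterization scheme is used, so the following is only a minimal sketch of the general idea, assuming a RepVGG-style block: at training time a 3x3 branch, a 1x1 branch, and an identity branch run in parallel, and at inference time they are folded into one equivalent 3x3 kernel. It uses a single channel, omits bias and batch normalization, and all function names are hypothetical rather than taken from the paper.

```python
# Illustrative single-channel sketch of structural reparameterization.
# Assumption: RepVGG-style branches (3x3 + 1x1 + identity); no bias, no BN.

def conv2d_same(image, kernel):
    """3x3 cross-correlation with zero padding, so output size equals input size."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(-1, 2):
                for dj in range(-1, 2):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:
                        acc += image[y][x] * kernel[di + 1][dj + 1]
            out[i][j] = acc
    return out


def pad_1x1_to_3x3(weight):
    """Place a 1x1 kernel weight at the centre of an all-zero 3x3 kernel."""
    kernel = [[0.0] * 3 for _ in range(3)]
    kernel[1][1] = weight
    return kernel


def merge_branches(k3, w1):
    """Fold the 3x3, 1x1, and identity branches into one 3x3 kernel."""
    k1 = pad_1x1_to_3x3(w1)
    k_id = pad_1x1_to_3x3(1.0)  # identity is a 1x1 kernel with weight 1
    return [[k3[r][c] + k1[r][c] + k_id[r][c] for c in range(3)]
            for r in range(3)]


def multi_branch(image, k3, w1):
    """Training-time form: sum of the three parallel branch outputs."""
    y3 = conv2d_same(image, k3)
    h, w = len(image), len(image[0])
    return [[y3[i][j] + w1 * image[i][j] + image[i][j] for j in range(w)]
            for i in range(h)]


# Toy input and weights, chosen arbitrarily for the demonstration.
image = [[1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0],
         [7.0, 8.0, 9.0]]
k3 = [[0.5, -1.0, 0.25],
      [0.0, 2.0, -0.5],
      [1.0, 0.0, -0.25]]
w1 = 0.75

fused = merge_branches(k3, w1)
y_train = multi_branch(image, k3, w1)   # three branches
y_infer = conv2d_same(image, fused)     # one fused convolution
max_diff = max(abs(a - b)
               for row_t, row_i in zip(y_train, y_infer)
               for a, b in zip(row_t, row_i))
```

Because convolution is linear, `max_diff` should be zero up to floating-point rounding: the fused kernel gives the same output as the three-branch form while needing only one convolution at inference. How the SE attention module is placed inside the reparameterized convolution is not described in the abstract and is not modelled here.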

Key words: Reparameterization, Feature fusion, Remote sensing detection, SSD, SE attention mechanism

CLC Number: TP391.4