Computer Science ›› 2024, Vol. 51 ›› Issue (3): 165-173. doi: 10.11896/jsjkx.230200030

• Computer Graphics & Multimedia •

Object Detection Method with Multi-scale Feature Fusion for Remote Sensing Images

ZHANG Yang, XIA Ying   

  1. School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  • Received: 2023-02-06 Revised: 2023-03-10 Online: 2024-03-15 Published: 2024-03-13
  • About author: ZHANG Yang, born in 1997, postgraduate. His main research interests include remote sensing and object detection. XIA Ying, born in 1972, Ph.D, professor, Ph.D supervisor, is a senior member of CCF (No.10248S). Her main research interests include spatiotemporal big data and cross-media retrieval.
  • Supported by:
    National Natural Science Foundation of China (41871226) and Key Cooperation Projects of Chongqing Municipal Education Commission (HZ2021008).

Abstract: Object detection for remote sensing images is an important research direction in computer vision and is widely used in military and civil applications. Objects in remote sensing images vary widely in scale, are densely arranged, and are similar across classes, so object detection methods designed for natural images produce many missed and false detections on remote sensing images. To address this problem, this paper proposes an object detection method with multi-scale feature fusion based on YOLOv5 for remote sensing images. Firstly, a residual unit fusing multi-head self-attention is introduced into the backbone network, through which multi-level feature information is fully extracted and semantic differences among different scales are reduced. Secondly, a feature pyramid network fusing lightweight upsampling operators is introduced to obtain high-level semantic features and low-level detail features. Feature fusion then yields feature maps with richer feature information, which improves the feature resolution of objects at different scales. The performance of the proposed method is evaluated on the DOTA and NWPU VHR-10 datasets, and the accuracy (mAP) of the method is improved by 1.5% and 2.0%, respectively, compared with the baseline model.
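
The abstract does not give the internals of the residual unit, so the following is only a minimal sketch of the general idea of wrapping multi-head self-attention in a residual connection. It assumes the feature map has already been flattened into a list of positions, each with a channel vector. The function name `mhsa_residual`, the projection matrices `w_q`, `w_k`, `w_v`, and the toy sizes are all hypothetical; positional encoding, normalization, and the output projection used in practical designs are omitted.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def mhsa_residual(tokens, w_q, w_k, w_v, num_heads):
    """Return tokens + MHSA(tokens).

    tokens: N x C list (N flattened spatial positions, C channels).
    w_q, w_k, w_v: C x C projection matrices (hypothetical parameters).
    """
    n, c = len(tokens), len(tokens[0])
    assert c % num_heads == 0, "channels must divide evenly across heads"
    d = c // num_heads
    q, k, v = matmul(tokens, w_q), matmul(tokens, w_k), matmul(tokens, w_v)
    attended = [[0.0] * c for _ in range(n)]
    for h in range(num_heads):
        lo, hi = h * d, (h + 1) * d
        qh = [row[lo:hi] for row in q]
        kh = [row[lo:hi] for row in k]
        vh = [row[lo:hi] for row in v]
        # Scaled dot-product attention: every position attends to every other.
        scores = [[sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in kh]
                  for qi in qh]
        weights = [softmax(r) for r in scores]
        head = matmul(weights, vh)  # N x d output of this head
        for i in range(n):
            attended[i][lo:hi] = head[i]  # concatenate heads along channels
    # Residual (skip) connection: add the attention output to the input.
    return [[x + y for x, y in zip(t, a)] for t, a in zip(tokens, attended)]


# Toy usage: 4 positions (e.g. a 2x2 feature map), 4 channels, 2 heads.
random.seed(0)
n_pos, channels, heads = 4, 4, 2
tokens = [[random.uniform(-1, 1) for _ in range(channels)] for _ in range(n_pos)]


def rand_w():
    return [[random.uniform(-0.5, 0.5) for _ in range(channels)] for _ in range(channels)]


w_q, w_k, w_v = rand_w(), rand_w(), rand_w()
out = mhsa_residual(tokens, w_q, w_k, w_v, heads)
print(len(out), len(out[0]))
```

Because of the skip connection, the unit keeps the input's shape and reduces to the identity when the value projection is zero, which is one reason residual designs are easy to insert into an existing backbone.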

Key words: Remote sensing images, Object detection, Multi-scale features, Feature fusion, YOLOv5

CLC Number: 

  • TP753
[1] XU Hao, LI Fengrun, LU Lu. Metal Surface Defect Detection Method Based on Dual-stream YOLOv4 [J]. Computer Science, 2024, 51(4): 209-216.
[2] LIU Zeyu, LIU Jianwei. Video and Image Salient Object Detection Based on Multi-task Learning [J]. Computer Science, 2024, 51(4): 217-228.
[3] XUE Jinqiang, WU Qin. Progressive Multi-stage Image Denoising Algorithm Combining Convolutional Neural Network and Multi-layer Perceptron [J]. Computer Science, 2024, 51(4): 243-253.
[4] HAO Ran, WANG Hongjun, LI Tianrui. Deep Neural Network Model for Transmission Line Defect Detection Based on Dual-branch Sequential Mixed Attention [J]. Computer Science, 2024, 51(3): 135-140.
[5] QIAO Fan, WANG Peng, WANG Wei. Multivariate Time Series Classification Algorithm Based on Heterogeneous Feature Fusion [J]. Computer Science, 2024, 51(2): 36-46.
[6] ZHANG Guodong, CHEN Zhihua, SHENG Bin. Infrared Small Target Detection Based on Dilated Convolutional Conditional Generative Adversarial Networks [J]. Computer Science, 2024, 51(2): 151-160.
[7] ZHAO Jiangfeng, HE Hongjie, CHEN Fan, YANG Shubin. Two-stage Visible Watermark Removal Model Based on Global and Local Features for Document Images [J]. Computer Science, 2024, 51(2): 172-181.
[8] LIU Changxin, WU Ning, HU Lirui, GAO Ba, GAO Xueshan. Recursive Gated Convolution Based Super-resolution Network for Remote Sensing Images [J]. Computer Science, 2024, 51(2): 205-216.
[9] WANG Weijia, XIONG Wenzhuo, ZHU Shengjie, SONG Ce, SUN He, SONG Yulong. Method of Infrared Small Target Detection Based on Multi-depth Feature Connection [J]. Computer Science, 2024, 51(1): 175-183.
[10] SHI Dianxi, LIU Yangyang, SONG Linna, TAN Jiefu, ZHOU Chenlei, ZHANG Yi. FeaEM: Feature Enhancement-based Method for Weakly Supervised Salient Object Detection via Multiple Pseudo Labels [J]. Computer Science, 2024, 51(1): 233-242.
[11] YANG Yi, SHEN Sheng, DOU Zhiyang, LI Yuan, HAN Zhenjun. Tiny Person Detection for Intelligent Video Surveillance [J]. Computer Science, 2023, 50(9): 75-81.
[12] CHEN Guojun, YUE Xueyan, ZHU Yanning, FU Yunpeng. Study on Building Extraction Algorithm of Remote Sensing Image Based on Multi-scale Feature Fusion [J]. Computer Science, 2023, 50(9): 202-209.
[13] ZHU Ye, HAO Yingguang, WANG Hongyu. Deep Learning Based Salient Object Detection in Infrared Video [J]. Computer Science, 2023, 50(9): 227-234.
[14] LIU Yubo, GUO Bin, MA Ke, QIU Chen, LIU Sicong. Design of Visual Context-driven Interactive Bot System [J]. Computer Science, 2023, 50(9): 260-268.
[15] ZHOU Fengfan, LING Hefei, ZHANG Jinyuan, XIA Ziwei, SHI Yuxuan, LI Ping. Facial Physical Adversarial Example Performance Prediction Algorithm Based on Multi-modal Feature Fusion [J]. Computer Science, 2023, 50(8): 280-285.