Computer Science, 2025, Vol. 52, Issue (10): 151-158. doi: 10.11896/jsjkx.250100097
ZHANG Hongsen1,2, WU Wei2, XU Jian2, WU Fei1,2, JI Yimu3
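Abstract: In ship detection, SAR images are widely used in scenarios such as marine resource management and maritime rescue because of their favorable imaging conditions. However, small ship targets and sea clutter cause traditional object detection algorithms to perform poorly. In recent years, many algorithms have either introduced the Transformer attention mechanism to obtain better semantic interpretation or adopted more complex network structures to strengthen feature extraction. These changes improve detection accuracy to some extent but sacrifice detection speed. To address this, a SAR image ship detection method based on RT-DETR with small-target feature enhancement is proposed. The method consists of three parts: 1) a large-model prompt generation network, which uses the zero-shot learning ability of a multimodal large model to generate prompts that help extract more discriminative information from the image modality; 2) an AIFI-EAA module, which takes RT-DETR as the baseline and improves its intra-scale feature interaction module by introducing an efficient additive attention mechanism, reducing the computational complexity of the algorithm; 3) a lightweight small-target feature enhancement fusion network, which adds a small-target detection layer to the multi-scale feature fusion network and uses a purpose-designed CSP-OmniKernel module for multi-scale feature fusion, improving detection performance on small targets. Experiments on three public datasets, SSDD, HRSID and SAR-Ship-Dataset, show that the proposed method has an advantage in accuracy.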
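The abstract does not spell out how the efficient additive attention in the AIFI-EAA module is computed. As a rough illustration only, the sketch below follows the general formulation published with SwiftFormer: each token gets a single scalar score, the queries are pooled into one global query, and that global query then interacts with every key element-wise. It is not the authors' module. The function and parameter names (`efficient_additive_attention`, `w_q`, `w_k`, `w_a`, `w_out`) are hypothetical, and the final output projection used in practice is omitted for brevity.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit L2 norm (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def project(rows, weight):
    """Apply a (d_in x d_out) weight matrix to every token (row) in `rows`."""
    d_out = len(weight[0])
    return [
        [sum(x * weight[i][j] for i, x in enumerate(row)) for j in range(d_out)]
        for row in rows
    ]


def efficient_additive_attention(tokens, w_q, w_k, w_a, w_out):
    """Additive attention whose cost grows linearly with the number of tokens.

    tokens: n tokens of dimension d; w_q, w_k, w_out: d x d matrices;
    w_a: d-dimensional scoring vector. All names are illustrative assumptions.
    """
    d = len(w_a)
    queries = [l2_normalize(q) for q in project(tokens, w_q)]
    keys = [l2_normalize(k) for k in project(tokens, w_k)]

    # One scalar score per token, so no n x n attention matrix is ever built.
    scores = [sum(q_i * a_i for q_i, a_i in zip(q, w_a)) / math.sqrt(d)
              for q in queries]
    weights = l2_normalize(scores)  # normalise the scores across tokens

    # Pool all queries into a single global query vector.
    global_q = [sum(w * q[j] for w, q in zip(weights, queries)) for j in range(d)]

    # Each key interacts with the global query element-wise and is projected;
    # the normalised query is added back as a residual connection.
    mixed = [[g * k_j for g, k_j in zip(global_q, k)] for k in keys]
    projected = project(mixed, w_out)
    return [[p + q_j for p, q_j in zip(p_row, q)]
            for p_row, q in zip(projected, queries)]


def random_matrix(rows, cols):
    """Helper for the demo: a rows x cols matrix of uniform random values."""
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


if __name__ == "__main__":
    random.seed(0)
    n, d = 6, 4  # 6 tokens, 4 channels (toy sizes)
    out = efficient_additive_attention(
        random_matrix(n, d), random_matrix(d, d), random_matrix(d, d),
        [random.uniform(-1, 1) for _ in range(d)], random_matrix(d, d),
    )
    print(len(out), len(out[0]))  # 6 4
```

Because every token interacts only with the pooled global query, the work scales with the number of tokens rather than with its square. This is the kind of saving the abstract refers to when it mentions reduced computational complexity.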