Computer Science ›› 2024, Vol. 51 ›› Issue (6): 264-271. doi: 10.11896/jsjkx.230300222

• Computer Graphics & Multimedia •

Center Point Target Detection Algorithm Based on Improved Swin Transformer

LIU Jiasen, HUANG Jun   

  1. School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  • Received: 2023-03-29 Revised: 2023-08-25 Online: 2024-06-15 Published: 2024-06-05
  • About author: LIU Jiasen, born in 1999, postgraduate. His main research interests include object detection and deep learning.
    HUANG Jun, born in 1971, Ph.D., professor, master supervisor. His main research interests include object detection and deep learning.
  • Supported by:
    National Natural Science Foundation of China (61771085).

Abstract: To address the weaknesses of Swin Transformer in extracting local feature information and in feature expression, this paper proposes a center point target detection algorithm based on an improved Swin Transformer. The network structure is adjusted and a deconvolution module is introduced to strengthen the network’s extraction of local feature information; an adaptive two-dimensional Gaussian kernel and a regression head module are used to detect the center point of each target, which strengthens feature expression; and dropout is added to the Swin Transformer block module to alleviate overfitting. The improved algorithm is validated on the Pascal VOC and MS COCO 2017 datasets. The experimental results show that it achieves an accuracy of 81.1% on the Pascal VOC dataset and 37.2% on the MS COCO dataset, significantly better than the other mainstream object detection algorithms compared.
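
To make the center point idea in the abstract concrete, the following is a minimal NumPy sketch of how a ground-truth heatmap can be rendered with a two-dimensional Gaussian whose spread adapts to the size of each box, in the style of CenterNet-like detectors. It is not the authors' implementation: the function names, the `scale` divisor that maps box size to standard deviation, and the 3-sigma truncation are all assumptions chosen for illustration.

```python
import numpy as np


def gaussian_2d(height, width, sigma_y, sigma_x):
    """Return a (height, width) Gaussian patch whose central value is 1.

    height and width are expected to be odd so the peak falls on a pixel.
    """
    ys = np.arange(height, dtype=np.float32) - (height - 1) / 2.0
    xs = np.arange(width, dtype=np.float32) - (width - 1) / 2.0
    return np.exp(-(ys[:, None] ** 2) / (2 * sigma_y ** 2)
                  - (xs[None, :] ** 2) / (2 * sigma_x ** 2))


def draw_center(heatmap, cx, cy, box_w, box_h, scale=6.0):
    """Splat one object center onto `heatmap` in place.

    Assumption (illustrative only): the standard deviation along each axis
    is the box extent divided by `scale`, so wide boxes get wide Gaussians.
    Overlapping objects are merged with an element-wise maximum.
    """
    sigma_x = max(box_w / scale, 1e-3)
    sigma_y = max(box_h / scale, 1e-3)
    # Truncate the kernel at roughly three standard deviations.
    rx = max(int(3 * sigma_x), 1)
    ry = max(int(3 * sigma_y), 1)
    kernel = gaussian_2d(2 * ry + 1, 2 * rx + 1, sigma_y, sigma_x)

    # Clip the patch so it stays inside the heatmap near the borders.
    h, w = heatmap.shape
    left, right = min(cx, rx), min(w - cx, rx + 1)
    top, bottom = min(cy, ry), min(h - cy, ry + 1)

    region = heatmap[cy - top:cy + bottom, cx - left:cx + right]
    patch = kernel[ry - top:ry + bottom, rx - left:rx + right]
    np.maximum(region, patch, out=region)  # writes through to heatmap
    return heatmap


# Usage: one 12x6 box centered at (x=16, y=10) on a 32x32 heatmap.
heatmap = np.zeros((32, 32), dtype=np.float32)
draw_center(heatmap, cx=16, cy=10, box_w=12, box_h=6)
```

In a detector of this family, a heatmap like this serves as the training target for the center point branch, while a separate regression head predicts the box size at each peak. The element-wise maximum keeps nearby objects from summing above 1.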

Key words: Deep learning, Image processing, Object detection, Deconv, Swin Transformer

CLC Number: 

  • TP391