Computer Science ›› 2023, Vol. 50 ›› Issue (6): 194-199. doi: 10.11896/jsjkx.220700145

• Computer Graphics & Multimedia •

PSwin: Edge Detection Algorithm Based on Swin Transformer

HU Mingyang1,2, GUO Yan1,2, JIN Yangshuang2   

  1 Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou, Jiangsu 215123, China
  2 School of Software Engineering, University of Science and Technology of China, Suzhou, Jiangsu 215123, China
  • Received: 2022-07-15 Revised: 2022-10-27 Online: 2023-06-15 Published: 2023-06-06
  • About author: HU Mingyang, born in 1997, master. His main research interests include computer vision and natural language processing. GUO Yan, born in 1981, lecturer. Her main research interests include information security, blockchain and NLP.

Abstract: Edge detection is a classical computer vision algorithm that is widely used in real-world scenarios such as license plate recognition and optical character recognition. It also serves as the basis for higher-level algorithms such as object detection and semantic segmentation, and can be applied in fields such as urban security and autonomous driving. A good edge detection algorithm can effectively improve the efficiency and accuracy of these computer vision tasks. The difficulty of the edge extraction task lies in the variation in target size and in edge detail, so an edge extraction algorithm must be able to handle edges at different scales effectively. In this paper, the Transformer is applied to the edge extraction task for the first time, and a novel feature pyramid network is proposed to make full use of the multi-scale and multi-level features of the backbone network. PSwin uses a self-attention mechanism, which can extract global structural information in images more efficiently than convolutional neural network architectures. When evaluated on the BSDS500 dataset, the proposed PSwin edge detection algorithm achieves the best performance, with an ODS F-measure of 0.826 and an OIS of 0.841.
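The two scores quoted in the abstract differ only in how the binarization threshold is chosen: ODS (optimal dataset scale) fixes one threshold for the whole dataset, while OIS (optimal image scale) picks the best threshold for each image. The sketch below illustrates that difference on made-up counts. It is not the BSDS500 benchmark code and not the authors' evaluation script: the helper names (`f_measure`, `pool`, `ods`, `ois`) and the toy numbers are assumptions, and the tolerance-based matching between predicted and ground-truth edge pixels is assumed to have already produced the counts.

```python
from typing import List, Tuple

# Hypothetical per-threshold counts for one image:
# (matched edge pixels, predicted edge pixels, ground-truth edge pixels)
Counts = Tuple[int, int, int]


def f_measure(tp: int, n_pred: int, n_gt: int) -> float:
    """Harmonic mean of precision (tp / n_pred) and recall (tp / n_gt)."""
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gt if n_gt else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


def pool(counts: List[Counts]) -> Counts:
    """Sum the counts of several images so one F-measure covers all of them."""
    return (
        sum(c[0] for c in counts),
        sum(c[1] for c in counts),
        sum(c[2] for c in counts),
    )


def ods(per_image: List[List[Counts]]) -> float:
    """Best F-measure when a single threshold is shared by every image."""
    n_thresholds = len(per_image[0])
    return max(
        f_measure(*pool([image[t] for image in per_image]))
        for t in range(n_thresholds)
    )


def ois(per_image: List[List[Counts]]) -> float:
    """F-measure when each image uses its own best threshold."""
    best = [max(image, key=lambda c: f_measure(*c)) for image in per_image]
    return f_measure(*pool(best))


# Toy data (assumed): two images, two thresholds each.
toy = [
    [(8, 10, 10), (5, 5, 10)],   # image A prefers threshold 0
    [(2, 10, 10), (6, 8, 10)],   # image B prefers threshold 1
]
print(f"ODS = {ods(toy):.3f}, OIS = {ois(toy):.3f}")
```

On this toy data the per-image choice gives the higher score, which matches the ordering of the two figures reported in the abstract (OIS 0.841 against ODS 0.826).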

Key words: Edge detection, Feature pyramid network, Visual attention, Transfer learning, BSDS500

CLC Number: TP391