Computer Science ›› 2023, Vol. 50 ›› Issue (12): 130-147. doi: 10.11896/jsjkx.221100076

• Computer Graphics & Multimedia •

Review of Transformer in Computer Vision

CHEN Luoxuan1,2, LIN Chengchuang3, ZHENG Zhaoliang1,2, MO Zefeng1,2, HUANG Xinyi1,2, ZHAO Gansen1,2   

  1. School of Computer Science, South China Normal University, Guangzhou 510663, China
    2. Guangzhou Key Lab on Cloud Computing Security and Assessment Technology, Guangzhou 510663, China
    3. Guangdong Planning and Designing Institute of Telecommunications Co., Ltd., Guangzhou 510630, China
  • Received: 2022-11-09  Revised: 2023-02-25  Online: 2023-12-15  Published: 2023-12-07
  • About author: CHEN Luoxuan, born in 1998, postgraduate, is a member of China Computer Federation. Her main research interests include computer vision and medical image segmentation.
    ZHAO Gansen, born in 1977, Ph.D, professor. His main research interests include medical artificial intelligence, medical images, and medical data analysis.
  • Supported by:
    National Natural Science Foundation of China (82271267) and National Social Science Fund of China (19ZDA041).

Abstract: The Transformer is an attention-based encoder-decoder architecture. Owing to its long-range sequence modeling and parallel computing capability, the Transformer has achieved significant breakthroughs in natural language processing and is gradually expanding into computer vision (CV), where it has become an important research direction. This paper focuses on three categories of vision Transformer-based CV tasks, namely classification, object detection and segmentation, and summarizes their applications and modifications. Starting from image classification, it first analyses the existing issues of vision Transformers, including data size, structure and computational efficiency, and then sorts out the corresponding solutions to each issue. It further provides a literature review on object detection and segmentation, organizing these methods according to their structures and motivations and summarizing their respective pros and cons. Finally, the challenges and future development trends of Transformers in computer vision are summarized and discussed.
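To make the mechanism surveyed above concrete, the following minimal NumPy sketch (not taken from the paper; the token and embedding sizes are illustrative) shows the scaled dot-product self-attention at the core of a vision Transformer, applied to a sequence of image-patch embeddings:

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    # Core Transformer operation: softmax(Q K^T / sqrt(d)) V
    d = q.shape[-1]
    scores = q @ k.swapaxes(-2, -1) / np.sqrt(d)            # pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)          # softmax over keys
    return weights @ v                                      # weighted sum of values

# A Vision Transformer (ViT) treats an image as a sequence of patch tokens:
# e.g. a 224x224 image split into 16x16 patches yields 14*14 = 196 tokens.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(196, 64))   # 196 toy patch embeddings of dimension 64
out = scaled_dot_product_attention(tokens, tokens, tokens)  # self-attention
print(out.shape)  # (196, 64): every token aggregates information from all others
```

Because every token attends to every other token, this single operation gives the long-range modeling ability the abstract refers to, at a quadratic cost in the number of tokens, which is exactly the computational-efficiency issue many of the surveyed variants address.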

Key words: Vision Transformer, Computer vision, Image classification, Object detection, Image segmentation

CLC Number: TP391