Computer Science, 2024, Vol. 51, Issue (2): 189-195. doi: 10.11896/jsjkx.221100218
WANG Wenjie, YANG Yan, JING Lili, WANG Jie, LIU Yan
Abstract: Given the excellent representational capability of the Transformer's self-attention mechanism, many researchers have proposed image-processing models built on self-attention and achieved great success. However, conventional self-attention-based image classification networks cannot balance global information against computational complexity, which limits the broad application of self-attention. This paper proposes an effective and scalable attention module, Local Neighbor Global Self-Attention (LNG-SA), which can exchange local, neighbor, and global information at any stage. By repeatedly cascading LNG-SA modules, a new network, LNG-Transformer, is designed. The network adopts a hierarchical structure overall, offers excellent flexibility, and its computational complexity grows linearly with image resolution. The properties of the LNG-SA module allow LNG-Transformer to exchange local, neighbor, and global information even in the early high-resolution stages, yielding higher efficiency and stronger learning capability. Experimental results show that LNG-Transformer performs well on image classification tasks.
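The abstract's claim that complexity is linear in image resolution rests on restricting attention to fixed-size token groups. The exact LNG-SA formulation is not given on this page, so the sketch below is only a generic non-overlapping-window self-attention in NumPy (function names and the `window` parameter are illustrative, not from the paper): because each window attends only within itself, the cost is O(N·window) rather than O(N²) over N tokens.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def window_self_attention(tokens, window=4):
    """Self-attention restricted to non-overlapping windows.

    tokens: (N, C) array of N token embeddings.
    For a fixed window size, splitting the N tokens into N/window
    independent groups makes the attention cost O(N * window),
    i.e. linear in N, instead of the O(N^2) of full attention.
    """
    n, c = tokens.shape
    assert n % window == 0, "token count must be divisible by window size"
    out = np.empty_like(tokens)
    for start in range(0, n, window):
        w = tokens[start:start + window]       # (window, C) tokens in one window
        scores = w @ w.T / np.sqrt(c)          # (window, window) scaled dot products
        out[start:start + window] = softmax(scores) @ w
    return out
```

A full model would additionally project queries, keys, and values and, as in LNG-SA, add pathways that mix information between neighboring windows and globally; this sketch shows only the local component that keeps the per-stage cost linear.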