Computer Science ›› 2023, Vol. 50 ›› Issue (5): 155-160. doi: 10.11896/jsjkx.220400035

• Computer Graphics & Multimedia •


Hyperspectral Image Classification Based on Swin Transformer and 3D Residual Multilayer Fusion Network

WANG Xianwang(王先旺), ZHOU Hao(周浩), ZHANG Minghui(张明慧), ZHU Youwei(朱尤伟)

  1. School of Information,Yunnan University,Kunming 650500,China
  • Received:2022-04-06 Revised:2022-07-04 Online:2023-05-15 Published:2023-05-06
  • Corresponding author:ZHOU Hao(zhouhao@ynu.edu.cn)
  • About author:WANG Xianwang,born in 1995,postgraduate(1357107853@qq.com).His main research interests include hyperspectral image classification.
    ZHOU Hao,born in 1972,Ph.D,associate professor.His main research interests include digital image processing and computer vision.
  • Supported by:
    Major Science and Technology Project of Yunnan Province(202202AD080004) and National Natural Science Foundation of China(12263008).


Abstract: Convolutional neural networks (CNNs) are widely used in hyperspectral image classification because of their excellent local context modeling ability. However, owing to the limitations of their inherent network backbone, CNNs fail to fully exploit and represent the sequence attributes of spectral features. To address this problem, a novel network based on the Swin Transformer and a 3D residual multilayer fusion network, named ReSTrans, is proposed for hyperspectral image classification. In the ReSTrans network, a 3D residual multilayer fusion network first extracts spatial-spectral features so as to mine the deep features of hyperspectral images as fully as possible, a self-attention-based Swin Transformer module then further captures the relationships among consecutive spectral bands, and a multilayer perceptron finally completes the classification from the joint spatial-spectral features. To verify the effectiveness of the ReSTrans model, it is evaluated on three hyperspectral data sets, IP, UP and KSC, where the classification accuracy reaches 98.65%, 99.64% and 99.78%, respectively. Compared with the SST method, the classification performance improves by 3.55%, 0.68% and 1.87% on average, respectively. Experimental results show that the model has good generalization ability and can extract deeper, more discriminative features.
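
For readers who want a concrete picture of the pipeline described above, the following minimal PyTorch sketch illustrates the general idea: a 3D residual block extracts joint spatial-spectral features from an image patch, a self-attention encoder models relations among band-wise tokens, and an MLP head produces class scores. This is an assumption-based illustration only, not the authors' ReSTrans implementation: the layer sizes (9×9 patches, 200 bands, 16 classes, two residual blocks) are hypothetical, and a plain transformer encoder stands in for the hierarchical shifted-window (Swin) attention and the multilayer feature fusion of the actual network.

# Illustrative sketch (not the paper's released code); assumes PyTorch >= 1.9.
import torch
import torch.nn as nn


class Residual3DBlock(nn.Module):
    """Two 3x3x3 convolutions with an identity shortcut over the spectral-spatial cube."""
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv3d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm3d(channels),
            nn.ReLU(inplace=True),
            nn.Conv3d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm3d(channels),
        )

    def forward(self, x):
        return torch.relu(self.body(x) + x)   # residual connection


class HSIResidualTransformer(nn.Module):
    """3D residual feature extractor -> transformer encoder -> MLP classifier."""
    def __init__(self, n_bands=200, patch_size=9, n_classes=16, channels=16, embed_dim=64):
        super().__init__()
        self.stem = nn.Conv3d(1, channels, kernel_size=3, padding=1)
        self.res_blocks = nn.Sequential(Residual3DBlock(channels), Residual3DBlock(channels))
        # Flatten the spatial dims per band; keep the spectral axis as the token sequence.
        self.to_tokens = nn.Linear(channels * patch_size * patch_size, embed_dim)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=4, dim_feedforward=128, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)
        self.head = nn.Sequential(nn.LayerNorm(embed_dim), nn.Linear(embed_dim, n_classes))

    def forward(self, x):                      # x: (B, 1, bands, H, W)
        f = self.res_blocks(self.stem(x))      # (B, C, bands, H, W)
        b, c, d, h, w = f.shape
        tokens = f.permute(0, 2, 1, 3, 4).reshape(b, d, c * h * w)  # one token per band
        tokens = self.to_tokens(tokens)
        z = self.encoder(tokens)               # self-attention over the spectral sequence
        return self.head(z.mean(dim=1))        # average pooling + MLP head


if __name__ == "__main__":
    model = HSIResidualTransformer()
    patch = torch.randn(2, 1, 200, 9, 9)       # two 9x9 patches with 200 bands
    print(model(patch).shape)                  # torch.Size([2, 16])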

Key words: Hyperspectral image classification, 3D residual multilayer fusion network, Self-attention mechanism, Swin Transformer, Spatial-spectral joint feature

CLC number: TP751.1
[1]REN S G,WAN S,GU X J,et al.Hyperspectral image classification based on multi-scale spatial spectrum identification features[J].Computer Science,2018,45(12):243-250.
[2]ZHU N,LI M.Multilevel selective kernel convolution for retina image classification[J].Journal of Chongqing University of Posts and Telecommunications(Natural Science Edition),2022,34(5):886-893.
[3]LV W,WANG X.Overview of Hyperspectral Image Classification[J].Journal of Sensors,2020,2020(2):1-13.
[4]HAUT J M,PAOLETTI M E,PLAZA J,et al.Visual attention-driven hyperspectral image classification[J].IEEE Transactions on Geoscience and Remote Sensing,2019,57(10):8065-8080.
[5]WEI X P,YU X C,TAN X,et al.CNN and 3D Gabor filter for hyperspectral image classification[J].Journal of Computer Aided Design and Graphics,2020,32(1):90-98.
[6]HE M,LI B,CHEN H,et al.Multi-scale 3D deep convolutional neural network for hyperspectral image classification[C]//2017 IEEE International Conference on Image Processing(ICIP).IEEE,2017:3904-3908.
[7]HANG R,LIU Q,HONG D,et al.Cascaded recurrent neural networks for hyperspectral image classification[J].IEEE Transactions on Geoscience and Remote Sensing,2019,57(8):5384-5394.
[8]TANG G,MÜLLER M,RIOS A,et al.Why Self-Attention? A Targeted Evaluation of Neural Machine Translation Architectures[J].arXiv:1808.08946,2018.
[9]CARION N,MASSA F,SYNNAEVE G,et al.End-to-end object detection with transformers[C]//European Conference on Computer Vision.Berlin:Springer,2020:213-219.
[10]RAMACHANDRAN P,PARMAR N,VASWANI A,et al.Stand-Alone Self-Attention in Vision Models[J].arXiv:1906.05909,2019.
[11]LIU Z,LIN Y,CAO Y,et al.Swin transformer:Hierarchical vision transformer using shifted windows[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision.2021:10012-10022.
[12]HU W,HUANG Y Y,LI H C,et al.Deep Convolutional Neural Networks for Hyperspectral Image Classification[J].Journal of Sensors,2015,2015:1-12.
[13]LIU B,YU X C,ZHANG P Q,et al.A semi-supervised convolutional neural network for hyperspectral image classification[J].Remote Sensing Letters,2017,8(9):839-848.
[14]BEN HAMIDA A,BENOIT A,LAMBERT P,et al.3D deep learning approach for remote sensing image classification[J].IEEE Transactions on Geoscience and Remote Sensing,2018,56(8):4420-4434.
[15]HARA K,KATAOKA H,SATOH Y.Learning spatio-temporal features with 3d residual networks for action recognition[C]//Proceedings of the IEEE International Conference on Computer Vision Workshops.2017.
[16]HE X,CHEN Y,LIN Z.Spatial-spectral transformer for hyperspectral image classification[J].Remote Sensing,2021,13(3):498.