Computer Science ›› 2023, Vol. 50 ›› Issue (5): 155-160. doi: 10.11896/jsjkx.220400035

• Computer Graphics & Multimedia •

基于Swin Transformer和三维残差多层融合网络的高光谱图像分类

王先旺, 周浩, 张明慧, 朱尤伟   

  1. 云南大学信息学院 昆明 650500
  • 收稿日期:2022-04-06 修回日期:2022-07-04 出版日期:2023-05-15 发布日期:2023-05-06
  • 通讯作者: 周浩(zhouhao@ynu.edu.cn)
  • 作者简介:(1357107853@qq.com)
  • 基金资助:
    云南省重大科技专项(202202AD080004);国家自然科学基金(12263008)

Hyperspectral Image Classification Based on Swin Transformer and 3D Residual Multilayer Fusion Network

WANG Xianwang, ZHOU Hao, ZHANG Minghui, ZHU Youwei   

  1. School of Information,Yunnan University,Kunming 650500,China
  • Received:2022-04-06 Revised:2022-07-04 Online:2023-05-15 Published:2023-05-06
  • About author:WANG Xianwang,born in 1995,postgraduate.His main research interests include hyperspectral image classification and so on.
    ZHOU Hao,born in 1972,Ph.D,associate professor.His main research interests include digital image processing and computer vision.
  • Supported by:
    Major Science and Technology Project of Yunnan Province(202202AD080004) and National Natural Science Foundation of China(12263008).

摘要: 卷积神经网络(CNNs)具有出色的局部上下文建模能力,被广泛用于高光谱图像分类中,但由于其固有网络主干的局限性,CNNs未能很好地挖掘和表示光谱特征的序列属性。为了解决此问题,提出了一种基于Swin Transformer和三维残差多层融合网络的新型网络(ReSTrans)用于高光谱图像分类。在ReSTrans网络中,为了尽可能地挖掘高光谱图像的深层特征,采用三维残差多层融合网络来提取空谱特征,然后由基于自注意力机制的Swin Transformer网络模块进一步捕获连续光谱间的关系,最后由多层感知机根据空谱联合特征完成最终的分类任务。为了验证ReSTrans网络模型的有效性,改进的模型在IP,UP和KSC 3个高光谱数据集上进行实验验证,分类精度分别达到了98.65%,99.64%,99.78%。与SST方法相比,该网络模型的分类性能分别平均提高了3.55%,0.68%,1.87%。实验结果表明该模型具有很好的泛化能力,可以提取更深层的、判别性的特征。

关键词: 高光谱图像分类, 三维残差多层融合网络, 自注意力机制, Swin Transformer, 空谱联合特征

Abstract: Convolutional neural networks (CNNs) are widely used in hyperspectral image classification because of their excellent local context modeling capability. However, owing to the inherent limitations of their network backbone, CNNs fail to fully mine and represent the sequence attributes of spectral features. To address this problem, a novel network based on Swin Transformer and a 3D residual multilayer fusion network, named ReSTrans, is proposed for hyperspectral image classification. To mine the deep features of hyperspectral images as fully as possible, the ReSTrans network first extracts spatial-spectral features with the 3D residual multilayer fusion network, then the self-attention-based Swin Transformer module further captures the relationships among consecutive spectral bands, and finally a multi-layer perceptron completes the classification from the joint spatial-spectral features. To verify the effectiveness of the ReSTrans model, experiments are conducted on three hyperspectral data sets, IP, UP and KSC, where the classification accuracy reaches 98.65%, 99.64% and 99.78%, respectively. Compared with the SST method, the classification performance improves by 3.55%, 0.68% and 1.87%, respectively. Experimental results show that the model has good generalization ability and can extract deeper, discriminative features.
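
As a rough illustration of the pipeline described in the abstract, the following PyTorch-style sketch stacks 3D convolutional residual blocks for spatial-spectral feature extraction, flattens the resulting feature cube into a token sequence, passes it through a self-attention encoder, and classifies with an MLP head. It is only a minimal sketch of the idea, not the authors' implementation: the class names (ReSTransSketch, Residual3DBlock), the layer counts, channel widths and the 30-band 9x9 input patch are illustrative assumptions, and a standard Transformer encoder stands in for the hierarchical shifted-window attention of the actual Swin Transformer module.

# Hedged sketch of the ReSTrans pipeline described in the abstract, not the authors' code:
# 3D residual blocks for spatial-spectral feature extraction, a plain Transformer encoder
# standing in for the Swin Transformer module, and an MLP classification head.
import torch
import torch.nn as nn


class Residual3DBlock(nn.Module):
    """3D convolutional residual block over (bands, height, width)."""
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv3d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm3d(channels),
            nn.ReLU(inplace=True),
            nn.Conv3d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm3d(channels),
        )
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(x + self.body(x))  # identity shortcut


class ReSTransSketch(nn.Module):
    def __init__(self, num_classes: int, channels: int = 32,
                 embed_dim: int = 64, depth: int = 2):
        super().__init__()
        # 3D residual multilayer feature extractor (spatial-spectral).
        self.stem = nn.Conv3d(1, channels, kernel_size=3, padding=1)
        self.res_blocks = nn.Sequential(Residual3DBlock(channels),
                                        Residual3DBlock(channels))
        self.proj = nn.Linear(channels, embed_dim)
        # Stand-in for the Swin Transformer module: plain self-attention encoder
        # (the shifted-window mechanism is omitted for brevity).
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=4, dim_feedforward=128, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=depth)
        # MLP head producing the final class scores.
        self.head = nn.Sequential(nn.LayerNorm(embed_dim),
                                  nn.Linear(embed_dim, num_classes))

    def forward(self, x):
        # x: (batch, 1, bands, height, width) hyperspectral patch
        feats = self.res_blocks(self.stem(x))             # (B, C, D, H, W)
        b, c, d, h, w = feats.shape
        tokens = feats.permute(0, 2, 3, 4, 1).reshape(b, d * h * w, c)
        tokens = self.proj(tokens)                        # token sequence
        tokens = self.encoder(tokens)                     # self-attention over tokens
        return self.head(tokens.mean(dim=1))              # pooled -> class logits


if __name__ == "__main__":
    # Example: a 9x9 spatial patch with 30 spectral bands, 16 land-cover classes.
    model = ReSTransSketch(num_classes=16)
    logits = model(torch.randn(2, 1, 30, 9, 9))
    print(logits.shape)  # torch.Size([2, 16])

Running the example produces one logit vector per input patch; in the paper's actual design the Swin module would instead compute self-attention within local windows that are shifted between successive layers, rather than over the full token sequence as in this sketch.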

Key words: Hyperspectral image classification, 3D residual multilayer fusion network, Self-attention mechanism, Swin Transformer, Spatial-spectral joint feature

中图分类号: TP751.1
[1]REN S G,WAN S,GU X J,et al.Hyperspectral image classification based on multi-scale spatial spectrum identification features[J].Computer Science,2018,45(12):243-250.
[2]ZHU N,LI M.Multilevel selective kernel convolution for retina image classification[J].Journal of Chongqing University of Posts and Telecommunications(Natural Science Edition),2022,34(5):886-893.
[3]LV W,WANG X.Overview of Hyperspectral Image Classification[J].Journal of Sensors,2020,2020(2):1-13.
[4]HAUT J M,PAOLETTI M E,PLAZA J,et al.Visual attention-driven hyperspectral image classification[J].IEEE Transactions on Geoscience and Remote Sensing,2019,57(10):8065-8080.
[5]WEI X P,YU X C,TAN X,et al.CNN and 3D Gabor filter for hyperspectral image classification[J].Journal of Computer-Aided Design & Computer Graphics,2020,32(1):90-98.
[6]HE M,LI B,CHEN H,et al.Multi-scale 3D deep convolutional neural network for hyperspectral image classification[C]//2017 IEEE International Conference on Image Processing(ICIP).IEEE,2017:3904-3908.
[7]HANG R,LIU Q,HONG D,et al.Cascaded recurrent neural networks for hyperspectral image classification[J].IEEE Transactions on Geoscience and Remote Sensing,2019,57(8):5384-5394.
[8]TANG G,MÜLLER M,RIOS A,et al.Why Self-Attention? A Targeted Evaluation of Neural Machine Translation Architectures[J].arXiv:1808.08946,2018.
[9]CARION N,MASSA F,SYNNAEVE G,et al.End-to-end object detection with transformers[C]//European Conference on Computer Vision.Berlin:Springer,2020:213-219.
[10]RAMACHANDRAN P,PARMAR N,VASWANI A,et al.Stand-Alone Self-Attention in Vision Models[J].arXiv:1906.05909,2019.
[11]LIU Z,LIN Y,CAO Y,et al.Swin transformer:Hierarchical vision transformer using shifted windows[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision.2021:10012-10022.
[12]HU W,HUANG Y Y,LI H C,et al.Deep Convolutional Neural Networks for Hyperspectral Image Classification[J].Journal of Sensors,2015,2015:1-12.
[13]LIU B,YU X C,ZHANG P Q,et al.A semi-supervised convolutional neural network for hyperspectral image classification[J].Remote Sensing Letters,2017,8(9):839-848.
[14]HAMID A B,BENOIT A,LAMBERT P,et al.3D deep learning approach for remote sensing image classification[J].IEEE Transactions on Geoscience and Remote Sensing,2018,56(8):4420-4434.
[15]HARA K,KATAOKA H,SATOH Y.Learning spatio-temporal features with 3d residual networks for action recognition[C]//Proceedings of the IEEE International Conference on Computer Vision Workshops.2017.
[16]HE X,CHEN Y,LIN Z.Spatial-spectral transformer for hyperspectral image classification[J].Remote Sensing,2021,13(3):498.