Computer Science ›› 2023, Vol. 50 ›› Issue (5): 155-160. doi: 10.11896/jsjkx.220400035

• Computer Graphics & Multimedia •

Hyperspectral Image Classification Based on Swin Transformer and 3D Residual Multilayer Fusion Network

WANG Xianwang, ZHOU Hao, ZHANG Minghui, ZHU Youwei

  1. School of Information, Yunnan University, Kunming 650500, China
  • Received: 2022-04-06 Revised: 2022-07-04 Online: 2023-05-15 Published: 2023-05-06
  • Corresponding author: ZHOU Hao (zhouhao@ynu.edu.cn)
  • About author: (1357107853@qq.com) WANG Xianwang, born in 1995, postgraduate. His main research interests include hyperspectral image classification.
    ZHOU Hao, born in 1972, Ph.D., associate professor. His main research interests include digital image processing and computer vision.
  • Supported by:
    Major Science and Technology Special Project of Yunnan Province (202202AD080004) and National Natural Science Foundation of China (12263008).

Abstract: Convolutional neural networks (CNNs) are widely used in hyperspectral image classification because of their excellent local context modeling ability. However, owing to the inherent limitations of their network backbone, CNNs fail to fully exploit and represent the sequence attributes of spectral features. To address this problem, a novel network based on Swin Transformer and a 3D residual multilayer fusion network (ReSTrans) is proposed for hyperspectral image classification. In the ReSTrans network, to mine the deep features of hyperspectral images as fully as possible, a 3D residual multilayer fusion network extracts spatial-spectral features, a Swin Transformer module based on the self-attention mechanism then further captures the relationships between consecutive spectra, and finally a multilayer perceptron performs the classification from the joint spatial-spectral features. To verify the effectiveness of the ReSTrans model, experiments are conducted on three hyperspectral datasets, IP, UP and KSC, where the classification accuracy reaches 98.65%, 99.64% and 99.78%, respectively. Compared with the SST method, the classification performance improves on average by 3.55%, 0.68% and 1.87%, respectively. The experimental results show that the model generalizes well and can extract deeper, more discriminative features.

Key words: Hyperspectral image classification, 3D residual multilayer fusion network, Self-attention mechanism, Swin Transformer, Spatial-spectral joint feature
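
The abstract states that a self-attention-based Swin Transformer module captures the relationships between consecutive spectra. As a rough illustration of that idea only, the sketch below applies scaled dot-product self-attention inside non-overlapping windows of a short sequence of spectral tokens. It is not the authors' ReSTrans implementation: the function names, the toy input, and the identity query/key/value projections are all assumptions made for this example, and the shifted-window step and learned weights of a real Swin block are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def window_self_attention(tokens, window_size):
    """Scaled dot-product self-attention inside non-overlapping windows.

    tokens: list of equal-length feature vectors, one per spectral token.
    Assumption: queries, keys and values are the tokens themselves
    (identity projections). A real Swin block learns these projections
    and alternates regular windows with shifted windows.
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    output = []
    for start in range(0, len(tokens), window_size):
        window = tokens[start:start + window_size]
        for query in window:
            # Similarity of this token to every token in the same window.
            scores = [scale * sum(q * k for q, k in zip(query, key))
                      for key in window]
            weights = softmax(scores)
            # Weighted average of the window's tokens, feature by feature.
            mixed = [sum(w * value[d] for w, value in zip(weights, window))
                     for d in range(dim)]
            output.append(mixed)
    return output


# Hypothetical toy input: 4 spectral tokens with 2 features each.
spectral_tokens = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [2.0, 2.0]]
attended = window_self_attention(spectral_tokens, window_size=2)
```

Each output token is a convex combination of the tokens in its own window, so attention here mixes information only between neighbouring spectral tokens. In the original Swin Transformer design, information crosses window boundaries because consecutive blocks shift the window partition.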

CLC number:

  • TP751.1