Computer Science ›› 2024, Vol. 51 ›› Issue (11A): 240300071-9. doi: 10.11896/jsjkx.240300071

• Image Processing & Multimedia Technology •

Gaussian-bias Self-attention and Cross-attention Based Medical Image Segmentation Model

LUO Huilan, GUO Yuchen   

  1. School of Information Engineering,Jiangxi University of Science and Technology,Ganzhou,Jiangxi 341000,China
  • Online:2024-11-16 Published:2024-11-13
  • Corresponding author: GUO Yuchen(dodge016@qq.com)
  • About author:(luohuilan@sina.com)
  • Supported by:
    National Natural Science Foundation of China(61862031),Key Project of the Natural Science Foundation of Jiangxi Province(20232ACB202011) and Leading Talents Project of the Plan for Cultivating Academic and Technical Leaders of Major Disciplines in Jiangxi Province(20213BCJ22004).

Gaussian-bias Self-attention and Cross-attention Based Module for Medical Image Segmentation

LUO Huilan, GUO Yuchen   

  1. School of Information Engineering,Jiangxi University of Science and Technology,Ganzhou,Jiangxi 341000,China
  • Online:2024-11-16 Published:2024-11-13
  • About author:LUO Huilan,born in 1974,professor,master's supervisor.Her main research interests include machine learning and pattern recognition.
    GUO Yuchen,born in 1998,postgraduate.His main research interest is computer vision.
  • Supported by:
    National Natural Science Foundation of China(61862031),Natural Science Foundation of Jiangxi Province,China(20232ACB202011) and Leading Talents Plan for the Technical Leaders of Major Disciplines in Jiangxi Province(20213BCJ22004).

Abstract: To address the problems in medical image segmentation of feature differences between targets, similar representations of the same anatomical structure across different slices, and excessive redundant information caused by the low distinction between organs and background, a network model based on Gaussian-bias self-attention and cross-attention (Gaussian bias and Contextual cross Attention U-Net, GCA-UNet) is proposed. Residual modules are used to establish a spatial prior; the Gaussian-bias self-attention of the Gaussian-bias self-attention & external attention module learns this spatial prior and strengthens the feature representation of adjacent regions, while the external attention mechanism learns correlations between different slices of the same sample. The contextual cross-attention gate uses multi-scale feature extraction to reinforce structural and boundary information, while recalibrating contextual semantic information and filtering out redundant information. Experimental results show that on the Synapse abdominal CT multi-organ segmentation dataset and the ACDC cardiac MRI dataset, GCA-UNet achieves Mean Dice segmentation accuracies of 81.37% and 91.69%, respectively, and a boundary accuracy (Mean HD95) of 16.01 on the Synapse dataset. Compared with other state-of-the-art medical image segmentation models, GCA-UNet achieves higher segmentation accuracy and clearer tissue boundaries.

Key words: Medical image segmentation, U-shaped network, Gaussian bias, External attention mechanism, Contextual cross-attention gate

Abstract: To address the problems in medical image segmentation,such as feature differences between targets,similar representations of the same anatomical structures across slices,and low distinction between organs and background leading to excessive redundant information,a novel model based on Gaussian-bias self-attention and contextual cross attention,named Gaussian bias and cross attention U-Net(GCA-UNet),is proposed.The model utilizes residual modules to establish spatial priors,employs Gaussian-bias self-attention to learn these priors and enhance the feature representations of adjacent areas,and uses an external attention mechanism to learn correlations between slices of the same sample.The contextual cross-attention gate leverages multi-scale feature extraction to reinforce structural and boundary information while recalibrating contextual semantic information and filtering out redundant data.Experimental results on the Synapse abdominal CT multi-organ segmentation dataset and the ACDC cardiac MRI dataset show that the proposed GCA-UNet achieves Mean Dice scores of 81.37% and 91.69%,respectively,with a Mean HD95 boundary precision of 16.01 on the Synapse dataset.Compared with other advanced medical image segmentation models,GCA-UNet offers higher segmentation accuracy and clearer tissue boundaries.
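The Gaussian-bias self-attention described in the abstract adds a distance-dependent additive bias to the attention logits, so that spatially adjacent positions attend to each other more strongly (in the spirit of the Gaussian Transformer [26]). A minimal single-head NumPy sketch of this general idea; the function and parameter names are illustrative and not the authors' implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def gaussian_bias_self_attention(X, Wq, Wk, Wv, sigma=2.0):
    """Single-head self-attention whose logits carry an additive
    Gaussian penalty on the squared distance between positions.
    X: (N, d) token features; Wq, Wk, Wv: (d, d) projections."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d = Q.shape[-1]
    logits = (Q @ K.T) / np.sqrt(d)            # (N, N) content scores
    pos = np.arange(X.shape[0], dtype=float)
    dist2 = (pos[:, None] - pos[None, :]) ** 2
    bias = -dist2 / (2.0 * sigma ** 2)         # 0 on the diagonal, more negative farther away
    A = softmax(logits + bias, axis=-1)        # each row sums to 1
    return A @ V, A
```

With zero projection weights the content scores vanish and each attention row peaks at its own position, which makes the locality prior visible in isolation; in a trained model the bias merely tilts the learned scores toward nearby regions.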

Key words: Medical image segmentation, U-shape network, Gaussian bias, External attention, Contextual cross attention gate
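The external-attention mechanism listed among the keywords [24] replaces query-key attention within a sample with attention against two small learnable memory units shared across all samples, which is what allows correlations between different slices to be captured. A simplified NumPy sketch (single softmax normalization rather than the original paper's double normalization; the names are illustrative):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def external_attention(X, Mk, Mv):
    """X: (N, d) token features; Mk, Mv: (S, d) external memories.
    Mk and Mv are learned parameters reused for every input sample,
    so they implicitly encode dataset-level correlations."""
    attn = softmax(X @ Mk.T, axis=-1)   # (N, S): each token attends to S memory slots
    return attn @ Mv                    # (N, d) features reconstructed from shared memory
```

Because S is typically much smaller than N, the cost is linear in the number of tokens, unlike the quadratic cost of standard self-attention.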

CLC number: TP391
[1]LONG J,SHELHAMER E,DARRELL T.Fully convolutional networks for semantic segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2015:3431-3440.
[2]RONNEBERGER O,FISCHER P,BROX T.U-net:Convolutional networks for biomedical image segmentation[C]//Medical Image Computing and Computer-Assisted Intervention-MICCAI 2015:18th International Conference,Munich,Germany,October 5-9,2015,Proceedings,Part III 18.Springer International Publishing,2015:234-241.
[3]VASWANI A,SHAZEER N,PARMAR N,et al.Attention is All you Need[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems.Long Beach,USA:Curran Associates Inc.,2017:6000-6010.
[4]DOSOVITSKIY A,BEYER L,KOLESNIKOV A,et al.An Image is Worth 16x16 Words:Transformers for Image Recognition at Scale[J].arXiv:2010.11929,2020.
[5]WANG H,XIE S,LIN L,et al.Mixed transformer u-net for medical image segmentation[C]//2022 IEEE International Conference on Acoustics,Speech and Signal Processing(ICASSP 2022).IEEE,2022:2390-2394.
[6]HO J,KALCHBRENNER N,WEISSENBORN D,et al.Axial Attention in Multidimensional Transformers[EB/OL].[2023-08-27].https://arxiv.org/pdf/1912.12180.pdf.
[7]OKTAY O,SCHLEMPER J,FOLGOC L L,et al.Attention U-Net:Learning Where to Look for the Pancreas[EB/OL].[2023-08-27].https://arxiv.org/pdf/1804.03999.pdf.
[8]ÇIÇEK Ö,ABDULKADIR A,LIENKAMP S S,et al.3D U-Net:learning dense volumetric segmentation from sparse annotation[C]//Medical Image Computing and Computer-Assisted Intervention-MICCAI 2016:19th International Conference,Athens,Greece,October 17-21,2016,Proceedings,Part II 19.Springer International Publishing,2016:424-432.
[9]XIAO X,LIAN S,LUO Z,et al.Weighted res-unet for high-quality retina vessel segmentation[C]//2018 9th International Conference on Information Technology in Medicine and Education(ITME).IEEE,2018:327-331.
[10]ZHOU Z,RAHMAN SIDDIQUEE M M,TAJBAKHSH N,et al.Unet++:A nested u-net architecture for medical image segmentation[C]//Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support:4th International Workshop,DLMIA 2018,and 8th International Workshop,ML-CDS 2018,Held in Conjunction with MICCAI 2018,Granada,Spain,September 20,2018,Proceedings 4.Springer International Publishing,2018:3-11.
[11]HUANG H,LIN L,TONG R,et al.Unet 3+:A full-scale connected unet for medical image segmentation[C]//IEEE International Conference on Acoustics,Speech and Signal Processing(ICASSP 2020).IEEE,2020:1055-1059.
[12]JI S Y,XIAO Z Y.Integrated context and multi-scale features in thoracic organs segmentation[J].Journal of Image and Graphics,2021,26(9):2135-2145.
[13]MA J L,OU Y K,MA Z P,et al.Multiscale adaptive fusion network based algorithm for liver tumor detection[J].Journal of Image and Graphics,2023,28(1):260-276.
[14]WAN X J,ZHOU Y Y,SHEN M F,et al.Multi-scale context information fusion for instance segmentation[J].Journal of Image and Graphics,2023,28(2):495-509.
[15]YU F,KOLTUN V.Multi-scale context aggregation by dilated convolutions[EB/OL].[2023-08-27].https://arxiv.org/pdf/1511.07122.pdf.
[16]ZHAO H,SHI J,QI X,et al.Pyramid scene parsing network[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2017:2881-2890.
[17]YANG H,BAI Z Y.CoT-TransUNet:Lightweight context Transformer medical image segmentation network[J].Computer Engineering and Applications,2023,59(3):218-225.
[18]AGHDAM E K,AZAD R,ZARVANI M,et al.Attention swin u-net:Cross-contextual attention mechanism for skin lesion segmentation[C]//2023 IEEE 20th International Symposium on Biomedical Imaging(ISBI).IEEE,2023:1-5.
[19]HUANG Z,WANG X,HUANG L,et al.Ccnet:Criss-cross attention for semantic segmentation[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision.2019:603-612.
[20]CHEN J,LU Y,YU Q,et al.TransUNet:Transformers Make Strong Encoders for Medical Image Segmentation[EB/OL].[2023-08-27].https://arxiv.org/pdf/2102.04306.pdf.
[21]LIU Z,LIN Y,CAO Y,et al.Swin transformer:Hierarchical vision transformer using shifted windows[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision.2021:10012-10022.
[22]CAO H,WANG Y,CHEN J,et al.Swin-unet:Unet-like puretransformer for medical image segmentation[C]//European Conference on Computer Vision.Cham:Springer Nature Switzerland,2022:205-218.
[23]VALANARASU J M J,OZA P,HACIHALILOGLU I,et al.Medical transformer:Gated axial-attention for medical image segmentation[C]//Medical Image Computing and Computer Assisted Intervention-MICCAI 2021:24th International Conference,Strasbourg,France,September 27-October 1,2021,Proceedings,Part I 24.Springer International Publishing,2021:36-46.
[24]GUO M H,LIU Z N,MU T J,et al.Beyond self-attention:External attention using two linear layers for visual tasks[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2022,45(5):5436-5447.
[25]HE K,ZHANG X,REN S,et al.Deep residual learning for image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2016:770-778.
[26]GUO M,ZHANG Y,LIU T.Gaussian transformer:a lightweight approach for natural language inference[C]//Proceedings of the AAAI Conference on Artificial Intelligence.2019:6489-6496.
[27]MILLETARI F,NAVAB N,AHMADI S A.V-net:Fully convolutional neural networks for volumetric medical image segmentation[C]//2016 Fourth International Conference on 3D Vision(3DV).IEEE,2016:565-571.
[28]FU S,LU Y,WANG Y,et al.Domain adaptive relational reasoning for 3d multi-organ segmentation[C]//Medical Image Computing and Computer Assisted Intervention-MICCAI 2020:23rd International Conference,Lima,Peru,October 4-8,2020,Proceedings,Part I 23.Springer International Publishing,2020:656-666.
[29]XIE J,ZHU R,WU Z,et al.FFUNet:A novel feature fusion makes strong decoder for medical image segmentation[J].IET Signal Processing,2022,16(5):501-514.
[30]HATAMIZADEH A,TANG Y,NATH V,et al.Unetr:Transformers for 3d medical image segmentation[C]//Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision.2022:574-584.
[31]HATAMIZADEH A,NATH V,TANG Y,et al.Swin unetr:Swin transformers for semantic segmentation of brain tumors in MRI images[C]//International MICCAI Brainlesion Workshop.Cham:Springer International Publishing,2021:272-284.
[32]AZAD R,HEIDARI M,SHARIATNIA M,et al.Transdeeplab:Convolution-free transformer-based deeplab v3+ for medical image segmentation[C]//International Workshop on PRedictive Intelligence In MEdicine.Cham:Springer Nature Switzerland,2022:91-102.