Computer Science ›› 2023, Vol. 50 ›› Issue (11A): 230200119-8. doi: 10.11896/jsjkx.230200119

• Interdisciplinary & Application •

Medical Microscopic Image Segmentation Model Based on CNN Structure and Swin Transformer

SUN Kaixin, LIU Bin, SU Shuguang

  1. School of Software Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
  • Published: 2023-11-09
  • Corresponding author: SU Shuguang (sueagle@163.com)
  • About author: SUN Kaixin (sunkaixin_1009@163.com), born in 1998, postgraduate. His main research interests include deep learning and image processing.
    SU Shuguang, born in 1975, Ph.D., assistant professor, Ph.D. supervisor, is a member of China Computer Federation. His main research interests include machine learning, image processing and embedded systems.
  • Supported by:
    Wuhan Science and Technology Research Program (2019010701011385).

Abstract: Medical microscopic image segmentation is valuable in clinical diagnosis and pathological analysis. However, microscopic images exhibit complex visual features such as varied shape, texture, and size, which makes accurate segmentation difficult. This paper proposes a new segmentation model, UMSTC, which is built on a U-shaped structure and fuses the U-Net model with the Swin Transformer model to capture both the fine details and the macro features of an image while maintaining modeling integrity. Specifically, the down-sampling part of UMSTC uses the Swin Transformer network to optimize its built-in attention mechanism for extracting micro and macro features. The up-sampling part is based on the deconvolution operation of a CNN and uses a residual mechanism to receive and fuse the feature maps from the down-sampling stage, which reduces the loss of accuracy in image synthesis. Experimental results show that UMSTC segments better than current mainstream medical image semantic segmentation models: mPA improves by approximately 3%~5% and mIoU by approximately 3%~8%, and the segmentation results have higher subjective visual quality and fewer noise artifacts. The UMSTC model therefore has broad application prospects in medical microscopic image segmentation.
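The U-shaped layout described in the abstract can be illustrated by tracking feature-map shapes alone. The sketch below is not the UMSTC implementation: it assumes the standard Swin-T configuration (patch size 4, embedding dimension 96, four stages, resolution halved and channels doubled by patch merging) and a transposed convolution with kernel 2 and stride 2 in the decoder, none of which are stated in the abstract. The helper names `encoder_shapes`, `deconv_out`, and `decoder_shapes` are hypothetical.

```python
from typing import List, Tuple

Shape = Tuple[int, int, int]  # (channels, height, width)


def encoder_shapes(height: int, width: int, patch: int = 4,
                   embed_dim: int = 96, stages: int = 4) -> List[Shape]:
    """Feature-map shapes of a Swin-style hierarchical encoder (assumed config).

    Patch embedding divides the resolution by `patch`; every later stage
    applies patch merging, which halves the resolution and doubles channels.
    """
    step = patch * 2 ** (stages - 1)
    if height % step or width % step:
        raise ValueError(f"input size must be divisible by {step}")
    shapes = []
    c, h, w = embed_dim, height // patch, width // patch
    for _ in range(stages):
        shapes.append((c, h, w))
        c, h, w = c * 2, h // 2, w // 2
    return shapes


def deconv_out(size: int, kernel: int = 2, stride: int = 2,
               padding: int = 0) -> int:
    """Output length of a transposed convolution (dilation 1, no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel


def decoder_shapes(enc: List[Shape]) -> List[Shape]:
    """Walk back up from the deepest encoder stage.

    Each step upsamples by transposed convolution and halves the channels.
    The result must equal the skip feature's shape, otherwise the two maps
    could not be fused by element-wise (residual-style) addition.
    """
    c, h, w = enc[-1]
    out = []
    for skip in reversed(enc[:-1]):
        c, h, w = c // 2, deconv_out(h), deconv_out(w)
        if (c, h, w) != skip:
            raise ValueError(f"cannot fuse {(c, h, w)} with skip {skip}")
        out.append((c, h, w))
    return out


if __name__ == "__main__":
    enc = encoder_shapes(224, 224)
    print("encoder:", enc)
    print("decoder:", decoder_shapes(enc))
```

Under these assumptions a 224×224 input gives encoder maps of 56, 28, 14, and 7 pixels per side, and each decoder step lands exactly on the shape of the matching encoder stage, which is what allows the skip features to be added rather than concatenated.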

Key words: Microscopic image segmentation, Swin Transformer, CNN, Attention mechanism, Residual network
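The abstract reports gains in mPA (mean pixel accuracy) and mIoU (mean intersection over union) without defining them. A minimal sketch of the commonly used definitions follows, computed from a confusion matrix over flattened label maps. It assumes the paper follows these standard definitions, which the abstract does not confirm, and the function names are illustrative only.

```python
from typing import List, Sequence


def confusion_matrix(truth: Sequence[int], pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """cm[i][j] counts pixels whose true class is i and predicted class is j."""
    if len(truth) != len(pred):
        raise ValueError("truth and pred must have the same length")
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(truth, pred):
        cm[t][p] += 1
    return cm


def mean_pixel_accuracy(cm: List[List[int]]) -> float:
    """Mean over classes of correct pixels divided by true pixels of that class."""
    accs = [row[i] / sum(row) for i, row in enumerate(cm) if sum(row) > 0]
    if not accs:
        raise ValueError("no labelled pixels")
    return sum(accs) / len(accs)


def mean_iou(cm: List[List[int]]) -> float:
    """Mean over classes of TP / (TP + FP + FN); absent classes are skipped."""
    n = len(cm)
    ious = []
    for i in range(n):
        tp = cm[i][i]
        fn = sum(cm[i]) - tp                       # true class i, predicted otherwise
        fp = sum(cm[r][i] for r in range(n)) - tp  # predicted i, true class otherwise
        union = tp + fn + fp
        if union > 0:
            ious.append(tp / union)
    if not ious:
        raise ValueError("no labelled pixels")
    return sum(ious) / len(ious)


if __name__ == "__main__":
    truth = [0, 0, 0, 0, 1, 1, 1, 1]  # flattened ground-truth mask (toy data)
    pred = [0, 0, 0, 1, 1, 1, 0, 0]   # flattened predicted mask (toy data)
    cm = confusion_matrix(truth, pred, num_classes=2)
    print(round(mean_pixel_accuracy(cm), 3), round(mean_iou(cm), 3))
```

For the toy masks in the example, the per-class accuracies are 0.75 and 0.5 and the per-class IoUs are 0.5 and 0.4, so mPA is 0.625 and mIoU is 0.45. The gap between the two shows why mIoU, which also penalizes false positives, is the stricter measure.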

CLC Number:

  • TP391.1