Computer Science ›› 2024, Vol. 51 ›› Issue (2): 135-141. doi: 10.11896/jsjkx.221100260

• Computer Graphics & Multimedia •

Medical Image Segmentation Algorithm Based on Self-attention and Multi-scale Input-Output

DING Tianshu, CHEN Yuanyuan   

  1. College of Computer Science, Sichuan University, Chengdu 610065, China
  • Received: 2022-11-29 Revised: 2023-04-06 Online: 2024-02-15 Published: 2024-02-22
  • About author: DING Tianshu, born in 1998, postgraduate. His main research interests include artificial intelligence and medical image analysis. CHEN Yuanyuan, born in 1983, Ph.D, associate professor, master supervisor. Her main research interests include artificial intelligence, the theory and applications of neural networks, and medical image analysis.

Abstract: Fine-grained segmentation of diabetic retinopathy lesions in fundus images can better assist doctors in diagnosis. The emergence of large-scale, high-resolution segmentation datasets provides favorable conditions for more refined segmentation. Mainstream segmentation networks are based on U-Net and rely on convolution, a local operation, so they cannot fully exploit global information when predicting each pixel. These networks also adopt a single-input, single-output structure, which makes it difficult to obtain multi-scale feature information. To make full use of existing large-scale, high-resolution fundus lesion segmentation datasets and achieve more refined segmentation, better segmentation methods need to be designed. In this paper, U-Net is modified with a self-attention mechanism and a multi-scale input/output structure, and a new segmentation network, SAM-Net, is proposed. Self-attention modules replace the traditional convolutional modules, improving the network's ability to capture global information. Multi-scale input and multi-scale output structures are introduced to make it easier for the network to obtain multi-scale feature information. Image slicing is used to reduce the model's input size, so that training does not become harder because of the large pixel dimensions of the input images. Finally, experimental results on the IDRiD and FGADR datasets show that SAM-Net achieves better performance than other methods.
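
The image slicing step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the tile size, the padding strategy, or whether tiles overlap, so the function names `slice_image` and `merge_patches`, the zero padding, and the non-overlapping grid below are all assumptions made for illustration, not the authors' implementation.

```python
from typing import Tuple

import numpy as np


def slice_image(image: np.ndarray, tile: int) -> Tuple[np.ndarray, Tuple[int, int]]:
    """Split an H x W x C image into non-overlapping tile x tile patches.

    Assumption: the image is zero-padded on the bottom and right so that
    both sides become multiples of `tile`. Returns the patches with shape
    (rows * cols, tile, tile, C) and the grid size (rows, cols).
    """
    h, w, c = image.shape
    rows = -(-h // tile)  # ceiling division
    cols = -(-w // tile)
    padded = np.zeros((rows * tile, cols * tile, c), dtype=image.dtype)
    padded[:h, :w] = image
    patches = (
        padded.reshape(rows, tile, cols, tile, c)
        .transpose(0, 2, 1, 3, 4)  # bring the two grid axes to the front
        .reshape(rows * cols, tile, tile, c)
    )
    return patches, (rows, cols)


def merge_patches(
    patches: np.ndarray, grid: Tuple[int, int], out_shape: Tuple[int, int]
) -> np.ndarray:
    """Reassemble patches from `slice_image` and crop away the padding."""
    rows, cols = grid
    tile, c = patches.shape[1], patches.shape[3]
    full = (
        patches.reshape(rows, cols, tile, tile, c)
        .transpose(0, 2, 1, 3, 4)  # undo the axis swap done when slicing
        .reshape(rows * tile, cols * tile, c)
    )
    h, w = out_shape
    return full[:h, :w]
```

In such a setup, each patch would be fed to the segmentation network separately, and the per-patch predictions would be stitched back with `merge_patches` to obtain a mask at the original resolution.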

Key words: U-Net, Self-attention, Diabetic retinopathy, Segmentation, Neural network

CLC Number: 

  • TP389.1