Computer Science ›› 2024, Vol. 51 ›› Issue (5): 100-107. doi: 10.11896/jsjkx.230400114

• Computer Graphics & Multimedia •

Medical Image Segmentation Network Integrating Full-scale Feature Fusion and RNN with Attention

SHAN Xinxin, LI Kai, WEN Ying   

  1. Shanghai Key Laboratory of Multidimensional Information Processing, School of Communication and Electronic Engineering, East China Normal University, Shanghai 200241, China
  • Received: 2023-04-16 Revised: 2023-08-16 Online: 2024-05-15 Published: 2024-05-08
  • About author: SHAN Xinxin, born in 1996, Ph.D. Her main research interests include computer vision and image processing.
    WEN Ying, born in 1975, professor, Ph.D. supervisor, is a member of CCF (No. F2169M). Her main research interests include computer vision, image processing and artificial intelligence.
  • Supported by:
    National Natural Science Foundation of China (62273150), Shanghai Natural Science Foundation (22ZR1421000), Shanghai Outstanding Academic Leaders Plan (21XD1430600) and Science and Technology Commission of Shanghai Municipality (22DZ2229004).

Abstract: Encoder-decoder networks perform well at image feature extraction and hierarchical feature fusion, and are widely used in medical image segmentation. However, mainstream encoder-decoder segmentation methods still face two problems: 1) in the encoding and decoding stages, the image feature information mined by a single network may be insufficient; 2) encoder-decoder networks that use simple skip connections cannot fully exploit the contextual information of full-scale features. To address these shortcomings, an encoder-decoder network integrating full-scale feature fusion and an RNN with attention is proposed for medical image segmentation. First, a convolutional multi-layer perceptron (MLP) module is introduced into the U-Net encoder to further expand the encoder's receptive field. Second, a full-scale feature fusion module fuses the skip-connection features of each scale with both coarse-grained and fine-grained information, which reduces the semantic gap between the skip-connection features of different scales and highlights the key feature information of the image. Finally, the decoder refines the image feature information level by level through the proposed recurrent attention decoding module (RADU), which combines a recurrent neural network (RNN) with an attention mechanism; this strengthens feature extraction while avoiding information redundancy, and yields the final segmentation result. The proposed method is compared with mainstream algorithms on the BrainWeb, MRBrainS, HVSMR and Choledoch datasets, where it improves segmentation precision in terms of pixel accuracy and Dice similarity coefficient. The experimental results show that, with the full-scale feature fusion module and the proposed RADU, the method achieves excellent segmentation results and has good noise robustness and anti-interference ability.
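
The two evaluation measures named in the abstract, pixel accuracy and the Dice similarity coefficient, can be illustrated with a minimal Python sketch. It assumes that a segmentation is given as a flat list of integer class labels, one per pixel. The function names and the toy label maps are hypothetical and are not taken from the paper, and this is not the authors' evaluation code.

```python
from typing import Dict, Sequence


def pixel_accuracy(pred: Sequence[int], target: Sequence[int]) -> float:
    """Fraction of pixels whose predicted label equals the reference label."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and of equal length")
    correct = sum(1 for p, t in zip(pred, target) if p == t)
    return correct / len(pred)


def dice_per_class(
    pred: Sequence[int], target: Sequence[int], num_classes: int
) -> Dict[int, float]:
    """Dice similarity coefficient 2|P ∩ T| / (|P| + |T|) for each class label."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have equal length")
    scores: Dict[int, float] = {}
    for c in range(num_classes):
        p_count = sum(1 for p in pred if p == c)
        t_count = sum(1 for t in target if t == c)
        overlap = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        denom = p_count + t_count
        # Assumed convention: a class absent from both maps scores 1.0.
        scores[c] = 1.0 if denom == 0 else 2.0 * overlap / denom
    return scores


if __name__ == "__main__":
    # Toy 8-pixel example with three hypothetical tissue classes (0, 1, 2).
    pred = [0, 0, 1, 1, 2, 2, 1, 0]
    target = [0, 0, 1, 2, 2, 2, 1, 1]
    print("pixel accuracy:", pixel_accuracy(pred, target))
    print("Dice per class:", dice_per_class(pred, target, num_classes=3))
```

Pixel accuracy rewards agreement on every pixel equally, whereas the per-class Dice score is normalised by the size of each class, so it is less dominated by large background regions.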

Key words: Medical image segmentation, Encoder-Decoder network, Multi-layer perceptron, Full-scale feature fusion, Attention mechanism, Recurrent neural network

CLC Number: 

  • TP391.7