Computer Science, 2024, Vol. 51, Issue (2): 107-116. doi: 10.11896/jsjkx.230900002

• Computer Graphics & Multimedia •

Image Segmentation Based on Deep Learning: A Survey

HUANG Wenke, TENG Fei, WANG Zidan, FENG Li   

  1. School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756, China
  • Received: 2023-09-01 Revised: 2023-11-15 Online: 2024-02-15 Published: 2024-02-22
  • About author: HUANG Wenke, born in 2000, postgraduate, is a member of CCF (No. N2093G). Her main research interests include image segmentation. TENG Fei, born in 1984, professor. Her main research interests include medical informatics, cloud computing, and medical big data analysis.

Abstract: Image segmentation is a fundamental task in computer vision whose main purpose is to extract meaningful and coherent regions from an input image. Over the years, a wide variety of segmentation techniques have been developed, ranging from traditional methods to more recent approaches built on convolutional neural networks. With the development of deep learning, a growing number of deep learning algorithms have been applied to image segmentation tasks. In particular, scholarly interest in deep learning has surged over the past two years, and many deep learning algorithms for image segmentation have emerged. However, most of these new algorithms have not been summarized or analyzed, which will hinder the progress of subsequent research. This paper provides a comprehensive review of the literature on deep learning-based image segmentation published in the past two years. First, it briefly introduces common datasets for image segmentation. Next, it presents a new classification of deep learning-based image segmentation methods. Finally, it discusses existing challenges and outlines future research directions.

Key words: Image segmentation, Semantic segmentation, Deep learning, Network structure, Supervised learning

CLC Number: 

  • TP391