Image Segmentation Based on Deep Learning: A Survey

doi:10.11896/jsjkx.230900002

Abstract

Abstract: Image segmentation is a fundamental task in computer vision whose main purpose is to extract meaningful and coherent regions from an input image. Over the years, a wide variety of techniques have been developed for image segmentation, ranging from traditional methods to more recent techniques that use convolutional neural networks. With the development of deep learning, more deep learning algorithms have been applied to image segmentation tasks. In particular, scholarly interest in deep learning has surged over the past two years, and many deep learning algorithms for image segmentation have emerged. However, most of these new algorithms have not yet been summarized or analyzed, which hinders subsequent research. This paper provides a comprehensive review of deep learning-based image segmentation research published in the past two years. It first briefly introduces the datasets commonly used for image segmentation, then presents a new classification of deep learning-based image segmentation, and finally discusses existing challenges and outlines future research directions.

Key words: Image segmentation, Semantic segmentation, Deep learning, Network structure, Supervised learning

CLC number: TP391

HUANG Wenke, TENG Fei, WANG Zidan, FENG Li. Image Segmentation Based on Deep Learning: A Survey[J]. Computer Science, 2024, 51(2): 107-116. https://doi.org/10.11896/jsjkx.230900002

