Computer Science, 2019, Vol. 46, Issue (9): 36-46. doi: 10.11896/j.issn.1002-137X.2019.09.005

• Surveys •

Research on Image Semantic Segmentation for Complex Environments

WANG Yan-ran1, CHEN Qing-liang1, WU Jun-jun2

  1. College of Information Science and Technology, Jinan University, Guangzhou 510632, China
  2. School of Mechatronics Engineering, Foshan University, Foshan, Guangdong 528225, China
  • Received: 2019-02-17 Online: 2019-09-15 Published: 2019-09-02

Abstract: Image semantic segmentation is one of the most important fundamental technologies for visual intelligence. It greatly helps intelligent systems understand their surrounding scenes, so it has enormous value in application domains such as unmanned vehicles, robot cognition and navigation, video surveillance, and drone landing systems. Semantic segmentation also faces great challenges, because targets in complex environments are subject to various interfering factors, such as unstructured targets, diversity of objects, irregular shapes, illumination changes, different viewing angles, scale variation, and object occlusion. In recent years, benefiting from the great advances in deep learning, a large number of approaches with practical significance have emerged in image semantic segmentation. To provide a comprehensive survey and inspire further academic research, this paper extensively discusses the existing state-of-the-art image semantic segmentation methods and classifies them into traditional methods, methods combining traditional and deep learning techniques, and methods based purely on deep learning. To address the problems that arise in complex environments, the semantic segmentation methods for complex environments that have emerged in recent years are analyzed and compared in detail, including their models, algorithms, and performance, under the categories of strongly supervised, weakly supervised, and unsupervised semantic segmentation. Furthermore, the main current datasets that contain various complex environments, such as PASCAL VOC, Cityscapes, and SUN RGB-D, and three evaluation indicators (PA, mPA, and mIoU) are summarized. Finally, the existing research on image semantic segmentation for complex environments is summarized, and future trends are discussed, such as optimization for real-time video, 3D scene reconstruction, and unsupervised semantic segmentation techniques.

Key words: Convolutional neural network, Deep learning, Image segmentation, Semantic segmentation, Visual intelligence

CLC Number: 

  • TP391
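
The three evaluation indicators named in the abstract are commonly read as pixel accuracy (PA), mean pixel accuracy (mPA), and mean intersection over union (mIoU). The following is a minimal sketch of how they are usually computed from a confusion matrix of pixel counts, not the paper's own code. The function names, the toy label maps, and the choice to skip classes that never occur are assumptions made for illustration.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Rows index the ground-truth class, columns the predicted class."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def segmentation_metrics(cm):
    """Return (PA, mPA, mIoU) from a square confusion matrix of pixel counts."""
    k = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(k))
    per_class_acc, per_class_iou = [], []
    for i in range(k):
        gt_i = sum(cm[i])                         # pixels whose true class is i
        pred_i = sum(cm[j][i] for j in range(k))  # pixels predicted as class i
        union = gt_i + pred_i - cm[i][i]
        # Assumption: classes that never occur are left out of the averages.
        if gt_i > 0:
            per_class_acc.append(cm[i][i] / gt_i)
        if union > 0:
            per_class_iou.append(cm[i][i] / union)
    pa = correct / total
    mpa = sum(per_class_acc) / len(per_class_acc)
    miou = sum(per_class_iou) / len(per_class_iou)
    return pa, mpa, miou


# Hypothetical toy data: flattened 2x3 label maps with two classes (0 and 1).
truth = [0, 0, 1, 1, 1, 0]
pred = [0, 1, 1, 1, 0, 0]
pa, mpa, miou = segmentation_metrics(confusion_matrix(truth, pred, 2))
print(f"PA={pa:.3f}  mPA={mpa:.3f}  mIoU={miou:.3f}")
```

PA weights every pixel equally, so large classes dominate it. mPA and mIoU average over classes, and mIoU additionally penalizes false positives through the union term, which may be why benchmarks often report it as the headline number.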