Image Semantic Segmentation Based on Deep Feature Fusion

doi:10.11896/jsjkx.190100119

Abstract

When convolutional networks are used for feature extraction in image semantic segmentation, repeated max pooling and downsampling reduce the feature resolution and discard context information, so the segmentation result becomes insensitive to object location. Networks based on the encoder-decoder architecture gradually refine the output through skip connections while restoring resolution, but simply summing adjacent features ignores the differences between them and easily leads to problems such as local misidentification of objects. To address this, an image semantic segmentation method based on deep feature fusion is proposed. It combines multiple fully convolutional VGG16 models in parallel, uses atrous convolutions to process the multi-scale images of an image pyramid efficiently in parallel, extracts multi-level context features, and fuses them layer by layer in a top-down manner to capture as much context information as possible. In addition, a layer-by-layer label supervision strategy based on an improved loss function provides auxiliary support, and a dense conditional random field models pixel relationships at the back end; together they ease model training and improve the accuracy of the predicted output. Experiments show that fusing, layer by layer, deep features that characterize context at different scales improves both the classification of target objects and the localization of spatial details. On the PASCAL VOC 2012 and PASCAL CONTEXT datasets, the proposed method achieves mIoU of 80.5% and 45.93%, respectively. The experimental data demonstrate that deep feature extraction, layer-by-layer feature fusion, and layer-by-layer label supervision within the parallel framework jointly optimize the algorithm architecture. Feature comparisons show that the model captures rich context information and obtains more detailed image semantic features, and that it has clear advantages over similar methods.
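For readers unfamiliar with the metric reported above, the following is a minimal sketch of how mean intersection-over-union (mIoU) can be computed for a single pair of label maps. It is an illustration under stated assumptions, not the authors' evaluation code: the function name `mean_iou` and its arguments are hypothetical, the label maps are assumed to be flattened into equal-length sequences of integer class indices, and "void" pixels are not handled. Benchmark evaluations also typically accumulate counts over the whole dataset before averaging, which this sketch does not do.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that occur in pred or target.

    pred and target are equal-length flat sequences of integer class
    labels in the range [0, num_classes). Hypothetical helper for
    illustration only.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    if len(pred) == 0:
        raise ValueError("label maps must not be empty")

    intersection = [0] * num_classes   # pixels where both maps agree on class c
    pred_count = [0] * num_classes     # pixels predicted as class c
    target_count = [0] * num_classes   # pixels labelled as class c

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both maps
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious)


# A 2x2 image flattened row by row, with two classes.
prediction = [0, 1, 1, 1]
ground_truth = [0, 1, 0, 1]
score = mean_iou(prediction, ground_truth, num_classes=2)
```

In this small example class 0 has IoU 1/2 and class 1 has IoU 2/3, so `score` is their average, 7/12.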

Key words: Atrous convolution, Conditional random field, Context information, Deep feature, Feature fusion, Image semantic segmentation

CLC Number: TP391

ZHOU Peng-cheng, GONG Sheng-rong, ZHONG Shan, BAO Zong-ming, DAI Xing-hua. Image Semantic Segmentation Based on Deep Feature Fusion [J]. Computer Science, 2020, 47(2): 126-134.

