Computer Science ›› 2020, Vol. 47 ›› Issue (11): 174-178. doi: 10.11896/jsjkx.191100014

• Computer Graphics & Multimedia •

Semantic Segmentation Transfer Algorithm Based on Atrous Convolution Discriminator

YANG Pei-jian1, WU Xiao-fu1, ZHANG Suo-fei2, ZHOU Quan1

  1. 1 School of Telecommunication and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
    2 School of Internet of Things, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
  • Received: 2019-11-03 Revised: 2020-06-06 Online: 2020-11-15 Published: 2020-11-05
  • Corresponding author: WU Xiao-fu (xfuwu@njupt.edu.cn)
  • Author contact: peijiany@163.com
  • About author: YANG Pei-jian, born in 1995, postgraduate. His main research interests include semantic segmentation and transfer learning.
    WU Xiao-fu, born in 1975, Ph.D., professor. His main research interests include computer vision, face recognition and transfer learning.
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61372123, 61701252).

Abstract: Supervised semantic segmentation with convolutional neural networks has made great progress in recent years. Because the pixel-level labeling required by supervised semantic segmentation is tedious and time-consuming, a recently popular alternative is to collect photo-realistic synthetic images from video games, for which pixel-level annotations can be generated automatically, and then to use transfer learning to carry a model trained on synthetic scenes over to real scenes. Because of domain shift, however, a model learned on synthetic scenes (the source domain) usually suffers a high generalization error when applied directly to real scenes (the target domain). To address this problem, we propose a novel unsupervised domain adaptation method for semantic segmentation. It first applies a conventional image style transfer network to the source-domain images so that their style aligns with the target domain, which reduces the domain gap. Generative adversarial training is then employed to align the source and target domain features. Since the discriminator in existing adversarial training schemes has a limited field of view, we propose to build the discriminator with atrous convolution, which enlarges its field of view and improves its discriminative ability. Experiments on two typical urban road scene datasets, GTA5 and SYNTHIA, show that, compared with the classic AdaptSegNet algorithm, the proposed algorithm improves the mean intersection over union (mIoU) by 4.5% on GTA5 and by 2.6% on SYNTHIA.

Key words: Atrous convolution, Deep learning, Domain adaptation, Generative adversarial network, Semantic segmentation, Transfer learning
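
The abstract's point that atrous convolution enlarges the discriminator's field of view can be illustrated with a small receptive-field calculation. The sketch below is only an illustration under stated assumptions: the function name `receptive_field` and both layer stacks (five 4x4, stride-2 convolutions, with and without dilation) are hypothetical and are not taken from the paper's architecture.

```python
def receptive_field(layers):
    """Receptive field, in input pixels, of one output unit of a conv stack.

    `layers` is a list of (kernel_size, stride, dilation) tuples,
    ordered from the input side to the output side.
    """
    rf = 1    # receptive field accumulated so far
    jump = 1  # spacing, in input pixels, between adjacent units at this depth
    for kernel, stride, dilation in layers:
        # A dilated kernel spans dilation * (kernel - 1) + 1 positions.
        effective_kernel = dilation * (kernel - 1) + 1
        rf += (effective_kernel - 1) * jump
        jump *= stride
    return rf


# Hypothetical discriminator: five 4x4 convolutions, stride 2, no dilation.
plain = [(4, 2, 1)] * 5

# Same kernels and strides, with assumed dilation rates on the last three layers.
dilated = [(4, 2, 1), (4, 2, 1), (4, 2, 2), (4, 2, 2), (4, 2, 4)]

print(receptive_field(plain))    # 94
print(receptive_field(dilated))  # 274
```

With the same number of parameters per layer, the dilated stack in this toy configuration sees a 274-pixel window instead of a 94-pixel one, which is the kind of enlarged field of view the abstract refers to.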

CLC Number: 

  • TP391