Computer Science (计算机科学), 2022, Vol. 49, Issue 11: 126-133. doi: 10.11896/jsjkx.220500193
JIN Yu-jie (金玉杰)1,2, CHU Xu (初旭)1,3, WANG Ya-sha (王亚沙)1,4, ZHAO Jun-feng (赵俊峰)1,2
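Abstract: Street-scene semantic segmentation aims to identify and segment pedestrians, obstacles, roads, signs, and other elements in images, giving vehicles information about the free space on the road; it is one of the key technologies for autonomous driving. High-performance semantic segmentation systems depend heavily on large amounts of real annotated training data, but labeling every pixel of an image is costly and often impractical. One low-cost way to obtain labeled data is to use video games to collect realistic synthetic images that are cheap to annotate, and to use them to help machine learning models segment real-world images; this is a domain adaptation problem. Current mainstream domain adaptation methods for semantic segmentation are based on VC-dimension theory or Rademacher complexity theory. In contrast, this work is inspired by an upper bound on the target-domain Gibbs risk, derived from PAC-Bayes theory, for domain adaptation with compatible pseudo-label functions. It considers the average case over the hypothesis space rather than the worst case, which avoids a problem of the mainstream methods: they over-constrain the domain discrepancy in the latent space, so the upper bound on the target-domain generalization error cannot be effectively estimated and optimized. Guided by these ideas, a variational inference method for semantic segmentation adaptation (VISA) is proposed. The method uses the Dropout variational family to perform variational inference for the ideal posterior distribution over the hypothesis space while quickly obtaining an approximate Bayes classifier, and it makes the estimate of the risk upper bound more accurate through target-domain entropy minimization and pixel filtering. Adaptation experiments on the street-scene semantic segmentation benchmark GTA5→Cityscapes show that VISA improves mean intersection over union (mIoU) by 0.5% to 6.6% over the baseline methods, and achieves high recognition accuracy on key street-scene elements such as pedestrians and vehicles.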
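The three ingredients named in the abstract (a Dropout variational family, an approximate Bayes classifier, and entropy-based pixel filtering) can be illustrated with a minimal sketch. This is not the VISA implementation: the toy linear per-pixel classifier, the function names, the dropout rate, the number of passes, and the entropy threshold are all assumptions chosen only for illustration. The sketch averages class probabilities over several stochastic dropout passes to approximate the posterior-averaged prediction, then keeps the pixels whose predictive entropy falls below a threshold.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def stochastic_forward(features, weights, drop_rate, rng):
    """One pass of a toy linear classifier with dropout applied to the features."""
    keep = 1.0 - drop_rate
    # Inverted dropout: surviving features are rescaled by 1 / keep.
    masked = [f / keep if rng.random() < keep else 0.0 for f in features]
    logits = [sum(w * x for w, x in zip(row, masked)) for row in weights]
    return softmax(logits)


def mc_dropout_predict(features, weights, drop_rate=0.5, passes=20, seed=0):
    """Average class probabilities over several dropout passes.

    Each pass samples one hypothesis from the dropout distribution; the
    average approximates the posterior-averaged (Bayes) prediction.
    """
    rng = random.Random(seed)
    mean = [0.0] * len(weights)
    for _ in range(passes):
        probs = stochastic_forward(features, weights, drop_rate, rng)
        mean = [m + p / passes for m, p in zip(mean, probs)]
    return mean


def select_confident_pixels(pixel_probs, max_entropy):
    """Return indices of pixels whose predictive entropy is at most max_entropy."""
    return [i for i, p in enumerate(pixel_probs) if entropy(p) <= max_entropy]


if __name__ == "__main__":
    # Hypothetical 2-class, 3-feature classifier and two target-domain "pixels".
    weights = [[1.0, 0.0, -1.0], [-1.0, 0.0, 1.0]]
    pixels = [[2.0, 0.5, -2.0], [0.1, 0.3, 0.1]]
    predictions = [mc_dropout_predict(f, weights) for f in pixels]
    kept = select_confident_pixels(predictions, max_entropy=0.5)
    print("mean probabilities:", predictions)
    print("pixels kept for the risk estimate:", kept)
```

In a real segmentation network the dropout layers would sit inside a deep model and the threshold would be tuned on data; the sketch only shows how the averaged prediction and the entropy filter fit together.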