Computer Science ›› 2023, Vol. 50 ›› Issue (4): 103-109. doi: 10.11896/jsjkx.220100259
李杨, 韩屏
LI Yang, HAN Ping
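Abstract: Human parsing is a fine-grained semantic segmentation task. As the annotation categories in human parsing datasets become more fine-grained, the datasets exhibit a long-tailed distribution, which makes similar categories increasingly difficult to distinguish. Balanced sampling is an effective remedy for long-tailed distributions. To address the difficulty of sampling annotated targets in a balanced way in human parsing, and the rising rate at which the model misclassifies similar categories, this paper proposes a human parsing model that combines regional sampling with an inter-class loss. The model has three parts: a semantic segmentation network, a Regionally Balanced Sampling Module (RBSM), and an Inter-class Loss Module (ILM). First, the image to be parsed is fed into the semantic segmentation network to obtain a preliminary prediction. The RBSM samples both the preliminary prediction and the ground-truth labels, and the main loss is computed on the sampled prediction and labels. In parallel, the feature heat map of the last layer of the semantic segmentation network is extracted and fed, together with the ground-truth labels, into the ILM to compute the inter-class loss. The model optimizes the main loss and the inter-class loss jointly, which yields a more accurate model. Experiments on the MHPv2.0 dataset show that, without changing the structure of the original semantic segmentation network, the model improves the mIoU metric by more than 1.3%, effectively mitigating the impact of the long-tailed distribution and inter-class similarity on human parsing.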
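The pipeline described in the abstract (sample the prediction and labels, compute a main loss on the sample, add an inter-class term, optimize both together) can be sketched in plain Python. This is a toy illustration under stated assumptions, not the authors' implementation: the abstract does not say how RBSM selects regions or how ILM defines its loss, so the per-class cap in `balanced_sample_indices`, the centroid-margin penalty in `inter_class_margin_loss`, and the `per_class_cap`, `margin`, and `weight` parameters are all hypothetical stand-ins.

```python
import math
import random
from collections import defaultdict


def balanced_sample_indices(labels, per_class_cap, rng):
    """ASSUMPTION: keep at most `per_class_cap` pixels per class.

    `labels` is a flat list of per-pixel class ids. Head classes are
    randomly subsampled; tail classes keep all of their pixels.
    """
    by_class = defaultdict(list)
    for idx, cls in enumerate(labels):
        by_class[cls].append(idx)
    chosen = []
    for cls in sorted(by_class):
        pixels = by_class[cls]
        if len(pixels) > per_class_cap:
            pixels = rng.sample(pixels, per_class_cap)
        chosen.extend(pixels)
    return sorted(chosen)


def cross_entropy(probs, labels, indices):
    """Main loss: mean negative log-likelihood over the sampled pixels."""
    total = sum(math.log(probs[i][labels[i]]) for i in indices)
    return -total / len(indices)


def inter_class_margin_loss(features, labels, margin):
    """ASSUMPTION: penalize class feature centers closer than `margin`."""
    sums, counts = {}, defaultdict(int)
    for feat, cls in zip(features, labels):
        prev = sums.get(cls, [0.0] * len(feat))
        sums[cls] = [s + f for s, f in zip(prev, feat)]
        counts[cls] += 1
    centers = {c: [s / counts[c] for s in sums[c]] for c in sums}
    classes = sorted(centers)
    penalties = [
        max(0.0, margin - math.dist(centers[a], centers[b]))
        for i, a in enumerate(classes)
        for b in classes[i + 1:]
    ]
    return sum(penalties) / len(penalties) if penalties else 0.0


def joint_loss(probs, features, labels, per_class_cap, margin, weight, rng):
    """Main loss on the balanced sample plus a weighted inter-class term."""
    idx = balanced_sample_indices(labels, per_class_cap, rng)
    main = cross_entropy(probs, labels, idx)
    inter = inter_class_margin_loss(features, labels, margin)
    return main + weight * inter
```

In a real training loop the two terms would be computed on tensors and back-propagated through the segmentation network; the list-based version here only makes the data flow between sampling, main loss, and inter-class loss explicit.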