Computer Science, 2023, Vol. 50, Issue (4): 103-109. doi: 10.11896/jsjkx.220100259

• Computer Graphics & Multimedia •

Human Parsing Model Combined with Regional Sampling and Inter-class Loss

LI Yang, HAN Ping   

  1. School of Information Engineering, Wuhan University of Technology, Wuhan 430070, China
  • Received: 2022-01-27 Revised: 2022-06-23 Online: 2023-04-15 Published: 2023-04-06
  • About author: LI Yang, born in 1998, postgraduate. His main research interests include deep learning and semantic segmentation.
    HAN Ping, born in 1980, Ph.D., associate professor. His main research interests include deep learning, computer vision, and embedded systems.
  • Supported by:
    Fundamental Research Funds for the Central Universities (WUT: 2018III069GX).

Abstract: Human parsing is a fine-grained semantic segmentation task. The refinement of annotated categories in human parsing datasets makes the data follow a long-tailed distribution and increases the difficulty of distinguishing similar categories. Balanced sampling is an effective remedy for long-tailed distributions, but it is difficult to sample the labeled objects in a balanced way in human parsing. In addition, fine-grained annotation makes the model prone to confusing similar categories. To address these problems, a human parsing model combining regional sampling and inter-class loss is proposed. The model consists of a semantic segmentation network, a regionally balanced sampling module (RBSM), and an inter-class loss module (ILM). First, the images are parsed by the semantic segmentation network. Next, the parsing results and the ground truth labels are sampled by the regionally balanced sampling module. The sampled parsing results and sampled ground truth labels are then used to calculate the master loss. Meanwhile, the inter-class loss between the heatmap features produced by the semantic segmentation network and the ground truth labels is calculated in the inter-class loss module, and the master loss and the inter-class loss are optimized jointly to obtain a more accurate model. Experimental results on the MHPv2.0 dataset show that the mIoU of the proposed model improves by more than 1.3% without changing the structure of the semantic segmentation network. The algorithm effectively reduces the impact of the long-tailed distribution and of the similarity among categories.
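
The idea of computing the master loss only on a class-balanced subset of pixels can be illustrated with a minimal NumPy sketch. This is not the paper's RBSM: the abstract does not specify the sampling rule, so the per-class pixel cap used here, the plain cross-entropy loss, and the names `balanced_pixel_sample`, `sampled_master_loss`, and `per_class` are all assumptions made for illustration. The inter-class loss is left out because the abstract does not define it.

```python
import numpy as np


def balanced_pixel_sample(labels, per_class, rng):
    """Pick at most `per_class` pixel indices for every class present.

    Hypothetical stand-in for regionally balanced sampling; the actual
    rule used by the paper is not given in the abstract.
    """
    flat = labels.ravel()
    chosen = []
    for cls in np.unique(flat):
        idx = np.flatnonzero(flat == cls)        # all pixels of this class
        take = min(per_class, idx.size)          # rare classes keep every pixel
        chosen.append(rng.choice(idx, size=take, replace=False))
    return np.concatenate(chosen)


def cross_entropy(logits, targets):
    """Mean cross-entropy; logits has shape (N, C), targets has shape (N,)."""
    shifted = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(targets.size), targets].mean())


def sampled_master_loss(logits_map, labels, per_class, rng):
    """Cross-entropy restricted to the balanced subset of pixels (assumed form)."""
    num_classes = logits_map.shape[-1]
    flat_logits = logits_map.reshape(-1, num_classes)
    flat_labels = labels.ravel()
    idx = balanced_pixel_sample(labels, per_class, rng)
    return cross_entropy(flat_logits[idx], flat_labels[idx])


# Toy 8x8 label map: class 0 dominates, classes 1 and 2 are rare.
rng = np.random.default_rng(0)
labels = np.zeros((8, 8), dtype=int)
labels[0, :3] = 1
labels[7, 7] = 2
logits_map = np.zeros((8, 8, 3))                 # uniform predictions
loss = sampled_master_loss(logits_map, labels, per_class=3, rng=rng)
```

In this toy label map, class 0 covers 60 pixels while classes 1 and 2 cover 3 and 1 pixels, so capping at 3 pixels per class yields a 3/3/1 sample instead of a 60/3/1 one, and the dominant class no longer swamps the loss.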

Key words: Regional sampling, Inter-class loss, Long-tailed distribution, Human parsing, Semantic segmentation

CLC Number: 

  • TP391