Computer Science, 2022, Vol. 49, Issue 11A: 220100057-5. doi: 10.11896/jsjkx.220100057
MA Wan-yi, ZHANG De-ping
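Abstract: Human pose estimation suffers from weak discrimination between the human body and the background, and HRNet-based pose estimation does not fully exploit important feature information. To address both problems, this paper uses channel and spatial attention mechanisms to propose MDA-HRNet, a human pose estimation method based on Multiscale Dual Attention (MDA). Working in the channel domain and the spatial domain, the method designs the Ca-Neck and Ca-Block modules, which incorporate channel attention, and the Sa-Block module, which incorporates spatial attention, and integrates them into the high-resolution network structure so that the network focuses on the human body regions of the image. The Sa-Block module uses 3×3 and 7×7 convolution kernels to derive spatial attention maps at two different scales, which strengthens the network's ability to distinguish human features from background features and thus lets it localize the human body and its keypoints accurately. Experiments on the MPII dataset show that MDA-HRNet effectively improves the accuracy of joint localization in human pose estimation.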
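The two-scale spatial attention described for Sa-Block can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract only states that 3×3 and 7×7 kernels produce two spatial attention maps, so the CBAM-style channel pooling (mean and max), the fusion of the two maps by averaging, the random kernel weights, and every function name below are assumptions made for illustration.

```python
import math
import random


def conv2d_same(planes, kernel):
    """Convolve a stack of 2D planes with a [plane][k][k] kernel.

    Zero padding keeps the spatial size; all planes are summed into one map.
    """
    k = len(kernel[0])
    pad = k // 2
    h, w = len(planes[0]), len(planes[0][0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for p, plane in enumerate(planes):
                for dy in range(k):
                    yy = y + dy - pad
                    if not 0 <= yy < h:
                        continue  # row falls in the zero padding
                    for dx in range(k):
                        xx = x + dx - pad
                        if 0 <= xx < w:
                            acc += plane[yy][xx] * kernel[p][dy][dx]
            out[y][x] = acc
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def spatial_attention_map(features, kernel):
    """Assumed CBAM-style map: pool over channels, convolve, squash to (0, 1)."""
    c = len(features)
    h, w = len(features[0]), len(features[0][0])
    avg = [[sum(features[ch][y][x] for ch in range(c)) / c for x in range(w)]
           for y in range(h)]
    mx = [[max(features[ch][y][x] for ch in range(c)) for x in range(w)]
          for y in range(h)]
    logits = conv2d_same([avg, mx], kernel)
    return [[sigmoid(v) for v in row] for row in logits]


def multiscale_spatial_attention(features, kernel3, kernel7):
    """Reweight every channel by the fused 3x3 and 7x7 attention maps."""
    a3 = spatial_attention_map(features, kernel3)
    a7 = spatial_attention_map(features, kernel7)
    h, w = len(a3), len(a3[0])
    # Assumption: the two scales are fused by a plain average.
    attn = [[0.5 * (a3[y][x] + a7[y][x]) for x in range(w)] for y in range(h)]
    refined = [[[plane[y][x] * attn[y][x] for x in range(w)] for y in range(h)]
               for plane in features]
    return refined, attn


def random_kernel(k, rng):
    """Hypothetical stand-in for learned weights: 2 input planes, k x k taps."""
    return [[[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(k)]
            for _ in range(2)]


rng = random.Random(0)
# Toy feature tensor: 4 channels of 8 x 8 activations.
features = [[[rng.random() for _ in range(8)] for _ in range(8)]
            for _ in range(4)]
refined, attn = multiscale_spatial_attention(
    features, random_kernel(3, rng), random_kernel(7, rng))
```

The output `refined` has the same shape as `features`, and `attn` is a single 8×8 map with values in (0, 1). In a trained network the kernels would be learned, so that positions on the human body receive weights close to 1 and background positions are suppressed.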