Research Advance on 2D Human Pose Estimation

doi:10.11896/jsjkx.200700061

Abstract

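Abstract: Human pose estimation has long been a research hotspot in computer vision. As the performance and accuracy of pose estimation methods have steadily improved, they can now be widely applied in human-computer interaction, intelligent surveillance, human activity analysis, and related fields. Human pose estimation is a strongly application-driven research area, and existing results touch on methods, models, and applications to varying degrees, so a systematic summary is urgently needed. This paper reviews a large body of research on two-dimensional human pose estimation for researchers' reference. Specifically, it covers single-person and multi-person pose estimation methods; pose estimation models based on ResNet, Hourglass, and HRNet; and applications of pose estimation in human-computer interaction and intelligent surveillance. The research problems and ideas proposed in this paper, including human pose estimation on mobile devices, in crowded scenes, and for people wearing equipment, complement existing research and offer researchers broad room for further work.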

Abstract: Human pose estimation has long been a research hotspot in computer vision. With the continuous improvement in the performance and accuracy of human pose estimation methods, it can be widely used in human-computer interaction, intelligent surveillance, human activity analysis, and other fields. This paper reviews and analyzes the methods, models, and applications of two-dimensional human pose estimation and discusses future research directions. The methods are divided into single-person and multi-person pose estimation. The models covered are mainly those based on ResNet, Hourglass, and HRNet. The applications covered are mainly in human-computer interaction and intelligent surveillance. The research outlook focuses mainly on expanding application scenarios. The paper summarizes research results from recent years and sorts out possible research directions.

Key words: Hourglass, HRNet, Human pose estimation, Key-point detection, Neural network, ResNet

CLC Number: TP311

FENG Xiao-yue, SONG Jie. Research Advance on 2D Human Pose Estimation[J]. Computer Science, 2020, 47(11): 128-136. https://doi.org/10.11896/jsjkx.200700061
