Computer Science, 2021, Vol. 48, Issue (4): 138-143. doi: 10.11896/jsjkx.200300042

• Computer Graphics & Multimedia •

Multi-person Activity Recognition Based on Bone Keypoints Detection

LI Meng-he, XU Hong-ji, SHI Lei-xin, ZHAO Wen-jie, LI Juan

  1. School of Information Science and Engineering, Shandong University, Qingdao, Shandong 266237, China
  • Received: 2020-06-24 Revised: 2020-08-02 Online: 2021-04-15 Published: 2021-04-09
  • Corresponding author: XU Hong-ji (hongjixu@sdu.edu.cn)
  • About author: LI Meng-he, born in 1994, postgraduate. Her main research interests include computer vision and artificial intelligence. (limenghe0309@163.com)
    XU Hong-ji, born in 1976, Ph.D., associate professor. His main research interests include wireless communications, ubiquitous computing, intelligent perception, blind signal processing and artificial intelligence.
  • Supported by:
    Sub-project of the National Key Research and Development Program of China (2018YFC0831001), National Natural Science Foundation of China (61771292) and the 13th Five-Year Plan on Education Science of Shandong Province (YZ2019070).

Abstract: Human activity recognition (HAR) is a research hotspot in computer vision, but multi-person HAR still presents many technical difficulties. In multi-person HAR, inaccurate estimation of the number of people and the difficulty of feature extraction lead to low recognition accuracy. To address this problem, a multi-person activity recognition system based on skeleton keypoint detection is proposed, which combines skeleton point extraction with action recognition. First, image frames are extracted from the original video. Second, the OpenPose algorithm is used to obtain human skeleton keypoint data, from which the people in each frame are detected and labeled. Third, human posture features are extracted according to the characteristics of the skeleton points. In addition, to accurately describe the relationship between posture features, a feature description method based on a frame window matrix is proposed, and a support vector machine (SVM) is used as the classifier to complete multi-person activity recognition. Ten typical daily activities from the public UT-Interaction and HMDB51 datasets are used as test objects. Experimental results show that the proposed method effectively extracts the skeleton keypoints of multiple people in an image, and that its average recognition accuracy over the 10 activities is 86.25%, higher than that of the other methods compared.

Key words: OpenPose algorithm, Posture feature extraction, Skeleton keypoints extraction, SVM classifier
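
The abstract does not specify the posture features or the exact layout of the frame window matrix, so the following is only a minimal sketch of the general idea under stated assumptions: per-frame features are taken to be joint angles computed from 2D keypoints, and a "frame window matrix" is taken to be a stack of the feature rows of `window` consecutive frames. The names `ANGLE_TRIPLETS`, `frame_features`, `frame_windows` and `flatten`, the four-point toy skeleton, and the window length are all hypothetical and are not taken from the paper. Keypoint detection (OpenPose) and the SVM stage are not shown.

```python
import math

# Hypothetical joint triplets (indices into one person's keypoint list).
# Each triplet (i, j, k) yields the angle at joint j. These indices are
# illustrative only and do not follow any particular keypoint layout.
ANGLE_TRIPLETS = [(0, 1, 2), (1, 2, 3)]


def joint_angle(a, b, c):
    """Return the angle in degrees at point b formed by segments b-a and b-c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(v1[0], v1[1])
    n2 = math.hypot(v2[0], v2[1])
    if n1 == 0.0 or n2 == 0.0:
        return 0.0  # degenerate case, e.g. coincident or missing keypoints
    cos_t = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    cos_t = max(-1.0, min(1.0, cos_t))  # guard against rounding error
    return math.degrees(math.acos(cos_t))


def frame_features(keypoints):
    """Map one frame's (x, y) keypoints to a list of joint angles."""
    return [joint_angle(keypoints[i], keypoints[j], keypoints[k])
            for i, j, k in ANGLE_TRIPLETS]


def frame_windows(per_frame_features, window, stride=1):
    """Stack consecutive frames into window matrices of shape window x D."""
    last_start = len(per_frame_features) - window
    return [per_frame_features[s:s + window]
            for s in range(0, last_start + 1, stride)]


def flatten(matrix):
    """Flatten a window matrix row by row into one feature vector."""
    return [value for row in matrix for value in row]


# Toy input: 5 frames of a 4-point skeleton whose last joint moves slightly.
frames = [[(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (2.0, 1.0 + 0.1 * t)]
          for t in range(5)]
features = [frame_features(f) for f in frames]
vectors = [flatten(w) for w in frame_windows(features, window=3)]
```

Each entry of `vectors` describes one short time span of one person. Paired with activity labels, such vectors could be passed to any SVM implementation (for example scikit-learn's `SVC`) for training and prediction; that choice is an assumption and is not stated in the abstract.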

CLC Number:

  • TP391