Computer Science, 2021, Vol. 48, Issue (11A): 130-135. doi: 10.11896/jsjkx.201200205

• Intelligent Computing •

Joint Learning of Causality and Spatio-Temporal Graph Convolutional Network for Skeleton-based Action Recognition

YE Song-tao1, ZHOU Yang-zheng1, FAN Hong-jie2, CHEN Zheng-lei3

    1 School of Computer Science, Xiangtan University, Xiangtan, Hunan 411105, China
    2 Department of Science and Technology Teaching, China University of Political Science and Law, Beijing 102249, China
    3 Wushu Research Institute, General Administration of Sport of China, Beijing 100029, China
  • Online: 2021-11-10 Published: 2021-11-12
  • Corresponding author: FAN Hong-jie (hjfan@cupl.edu.cn)
  • About author: YE Song-tao (yesongtao@xtu.edu.cn), born in 1983, Ph.D., associate professor. His main research interests include truth discovery, data analysis and mining, and action recognition.
    FAN Hong-jie, born in 1984, Ph.D., lecturer. His main research interests include knowledge graphs, data exchange, and data analysis and mining.
  • Supported by:
    National Natural Science Foundation of China (61802327) and Natural Science Foundation of Hunan Province (2018JJ3511).

Abstract: Skeleton-based human action recognition has attracted extensive attention in recent years because of its simplicity and robustness. Most skeleton-based methods, such as the spatio-temporal graph convolutional network (ST-GCN), distinguish actions by extracting temporal features across consecutive frames and spatial features of the skeleton joints within each frame, and they achieve good results. Considering the causal relationships present in human motion, we propose an action recognition method that combines causality with a spatio-temporal graph convolutional network. Because obtaining joint weights by computing joint torques is complex, we instead assign edge weights to the skeleton graph according to the causal relationships between joints. These weights serve as auxiliary information for the graph convolutional network: they raise the attention paid to joints with a strong driving force and lower the attention paid to joints of low importance. Compared with ST-GCN and similar methods, the proposed method considerably improves both Top-1 and Top-5 accuracy on the public Kinetics dataset. On a real-world Tai Chi dataset that we constructed, it reaches a recognition accuracy of 97.38% (Top-1) and 99.79% (Top-5), which shows that the method effectively enhances action features and improves recognition accuracy.
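The idea of feeding edge weights into a graph convolution can be illustrated with a small sketch. This is not the authors' implementation: the symmetric normalization, the function names `normalized_weighted_adjacency` and `graph_conv`, and the three-joint toy skeleton are all assumptions chosen for illustration, and ST-GCN itself uses a more elaborate partitioned adjacency.

```python
import numpy as np

def normalized_weighted_adjacency(num_joints, edges, weights):
    """Return D^{-1/2} (A_w + I) D^{-1/2} for an undirected weighted skeleton graph.

    `weights` are assumed to be non-negative importance scores, one per edge
    (hypothetical stand-ins for causality-derived weights).
    """
    adj = np.eye(num_joints)              # self-loops keep each joint's own feature
    for (i, j), w in zip(edges, weights):
        adj[i, j] = w
        adj[j, i] = w
    inv_sqrt_deg = 1.0 / np.sqrt(adj.sum(axis=1))
    return adj * inv_sqrt_deg[:, None] * inv_sqrt_deg[None, :]

def graph_conv(features, adj_norm, kernel):
    """Spatial graph convolution.

    features: (T, V, C_in) frames x joints x channels
    adj_norm: (V, V) normalized weighted adjacency
    kernel:   (C_in, C_out) shared linear map
    """
    # Each joint aggregates its neighbours' features, scaled by edge weight.
    aggregated = np.einsum("uv,tvc->tuc", adj_norm, features)
    return aggregated @ kernel

# Toy chain of three joints: edge (0, 1) is treated as more important than (1, 2).
adj_norm = normalized_weighted_adjacency(3, [(0, 1), (1, 2)], [0.9, 0.2])
out = graph_conv(np.ones((4, 3, 2)), adj_norm, np.ones((2, 5)))
```

In this sketch a larger edge weight means the neighbouring joint contributes more to the aggregated feature, which is one simple way to realize "more attention to high-importance joints".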

Key words: Action recognition, Causality, Convergent cross mapping, Spatio-temporal graph convolutional neural network, Weight embedding
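Convergent cross mapping (CCM), listed among the key words, can be sketched in a few lines of NumPy. This page does not describe how the authors turn CCM scores into edge weights, so the code below is only a generic, simplified illustration: the function names, the default embedding dimension and lag, and the use of a single library size (a full CCM analysis also checks that the skill converges as the library grows) are assumptions.

```python
import numpy as np

def shadow_manifold(series, dim, lag):
    """Time-delay embedding: row t is (s[t], s[t+lag], ..., s[t+(dim-1)*lag])."""
    series = np.asarray(series, dtype=float)
    n = len(series) - (dim - 1) * lag
    return np.stack([series[i * lag : i * lag + n] for i in range(dim)], axis=1)

def cross_map_skill(source, target, dim=3, lag=1):
    """Estimate `target` from the shadow manifold of `source`.

    Returns the Pearson correlation between `target` and its estimate.
    A high value suggests that `target` leaves a signature in `source`,
    i.e. that `target` causally influences `source`.
    """
    manifold = shadow_manifold(source, dim, lag)
    aligned = np.asarray(target, dtype=float)[(dim - 1) * lag :]

    # Pairwise distances between manifold points; a point is not its own neighbour.
    dists = np.linalg.norm(manifold[:, None, :] - manifold[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)

    # dim + 1 nearest neighbours, weighted by distance relative to the closest one.
    nn = np.argsort(dists, axis=1)[:, : dim + 1]
    nn_dist = np.take_along_axis(dists, nn, axis=1)
    weights = np.exp(-nn_dist / np.maximum(nn_dist[:, :1], 1e-12))
    weights /= weights.sum(axis=1, keepdims=True)

    estimate = (weights * aligned[nn]).sum(axis=1)
    return np.corrcoef(aligned, estimate)[0, 1]
```

For two joint trajectories `a` and `b` (for example, one coordinate of each joint over time), `cross_map_skill(a, b)` and `cross_map_skill(b, a)` give a directional pair of scores that could, under the assumptions above, be used as candidate edge weights.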

CLC Number: TP391.4