Computer Science ›› 2021, Vol. 48 ›› Issue (11A): 130-135. doi: 10.11896/jsjkx.201200205

• Intelligent Computing •

Joint Learning of Causality and Spatio-Temporal Graph Convolutional Network for Skeleton-based Action Recognition

YE Song-tao1, ZHOU Yang-zheng1, FAN Hong-jie2, CHEN Zheng-lei3   

    1 School of Computer Science,Xiangtan University,Xiangtan,Hunan 411105,China
    2 Department of Science and Technology Teaching,China University of Political Science and Law,Beijing 102249,China
    3 Wushu Research Institute,General Administration of Sport of China,Beijing 100029,China
  • Online:2021-11-10 Published:2021-11-12
  • About author: YE Song-tao, born in 1983, Ph.D, associate professor. His main research interests include truth discovery, data analysis and mining, and action recognition.
    FAN Hong-jie, born in 1984, Ph.D, lecturer. His main research interests include knowledge graphs, data exchange, and data analysis and mining.
  • Supported by:
    National Natural Science Foundation of China(61802327) and Natural Science Foundation of Hunan Province(2018JJ3511).

Abstract: In recent years, skeleton-based human action recognition has attracted extensive attention due to its simplicity and robustness. Most skeleton-based methods, such as the spatio-temporal graph convolutional network (ST-GCN), distinguish actions by extracting temporal features across consecutive frames and spatial features of the skeleton joints within each frame, and achieve good results. In this paper, considering the causality inherent in human action, we propose an action recognition method that combines causality with a spatio-temporal graph convolutional network. To avoid the complexity of obtaining the weights directly, we propose a method that computes joint weights from causality. These weights are assigned to the skeleton graph and used as auxiliary information to enhance the graph convolutional network, increasing the weight of joints with strong driving force in the neural network, so that joints of low importance receive less attention and joints of high importance receive more. Compared with ST-GCN, our method improves both Top-1 and Top-5 recognition accuracy, reaching 97.38% (Top-1) and 99.79% (Top-5) on the real TaiChi dataset, which strongly proves that our method can effectively learn and enhance discriminative features.
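To make the weighting idea concrete, the following is a minimal Python sketch, not the authors' implementation: it estimates pairwise causal strength between joint trajectories with a simplified convergent cross mapping (CCM), aggregates it into per-joint weights, and embeds those weights into the skeleton adjacency matrix before graph convolution. All function names (ccm_strength, joint_weights, weighted_adjacency), the simplified CCM estimator, and the toy adjacency matrix are illustrative assumptions.

```python
import numpy as np

def _embed(series, dim=3, tau=1):
    """Time-delay embedding of a 1-D series into a shadow manifold."""
    n = len(series) - (dim - 1) * tau
    return np.stack([series[i:i + n] for i in range(0, dim * tau, tau)], axis=1)

def ccm_strength(x, y, dim=3, tau=1):
    """Simplified CCM: how well the shadow manifold of y reconstructs x.
    Returns a value in [0, 1]; larger means x drives y more strongly."""
    My = _embed(y, dim, tau)
    target = x[(dim - 1) * tau:]
    k = dim + 1                                    # number of nearest neighbours
    preds = np.empty(len(My))
    for i, p in enumerate(My):
        d = np.linalg.norm(My - p, axis=1)
        d[i] = np.inf                              # exclude the point itself
        idx = np.argsort(d)[:k]
        w = np.exp(-d[idx] / (d[idx][0] + 1e-8))   # distance-based weights
        preds[i] = np.sum(w * target[idx]) / np.sum(w)
    r = np.corrcoef(preds, target)[0, 1]
    return 0.0 if not np.isfinite(r) else max(r, 0.0)

def joint_weights(motion, dim=3, tau=1):
    """motion: (T, V) array, one scalar trajectory per joint.
    A joint's weight is its average driving strength over all other joints."""
    V = motion.shape[1]
    strength = np.zeros((V, V))
    for i in range(V):
        for j in range(V):
            if i != j:
                strength[i, j] = ccm_strength(motion[:, i], motion[:, j], dim, tau)
    w = strength.mean(axis=1)
    return w / (w.max() + 1e-8)                    # normalise to [0, 1]

def weighted_adjacency(A, w):
    """Embed joint weights into the skeleton graph: edges touching joints with
    strong driving force are amplified before graph convolution."""
    scale = 1.0 + np.sqrt(np.outer(w, w))
    return A * scale

if __name__ == "__main__":
    T, V = 120, 18                                 # e.g. an 18-joint OpenPose skeleton
    rng = np.random.default_rng(0)
    motion = rng.standard_normal((T, V)).cumsum(axis=0)   # toy joint trajectories
    A = np.eye(V)                                  # toy adjacency (self-links only)
    w = joint_weights(motion)
    print("joint weights:", np.round(w, 2))
    print("weighted adjacency diagonal:", np.round(np.diag(weighted_adjacency(A, w)), 2))
```

In practice each joint contributes a multi-dimensional trajectory (2D or 3D coordinates per frame), so a scalar summary such as joint speed or a single coordinate channel would feed the CCM step, and the weighted adjacency would replace the uniform adjacency inside an ST-GCN layer.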

Key words: Action recognition, Causality, Convergent cross mapping, Spatio-temporal graph convolutional neural network, Weight embedding

CLC Number: 

  • TP391.4
[1]STEFAN M,CRISTIAN S.Actions in the Eye:Dynamic Gaze Datasets and Learnt Saliency Models for Visual Recognition[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2014,37(7):1408-1424.
[2]RONALD P.A Survey on Vision-based Human Action Recognition[J].Image and Vision Computing,2010,28(6):976-990.
[3]SUDHA M R,SRIRAGHAV K,JACOB S G,et al.Approaches and Applications of Virtual Reality and Gesture Recognition:A review[J].International Journal of Ambient Computing and Intelligence (IJACI),2017,8(4):1-18.
[4]WANG P,LI W,OGUNBONA P,et al.RGB-D-based Human Motion Recognition with Deep Learning:A Survey[J].Computer Vision & Image Understanding,2018,171:118-139.
[5]SONG S J,LAN C L,XING J L,et al.An End-to-End Spatio-Temporal Attention Model for Human Action Recognition from Skeleton Data[C]//Proceeding of Thirty-First AAAI Conference on Artificial Intelligence.CA:AAAI,2017:4263-4270.
[6]LIU J,SHAHROUDY A,XU D,et al.Spatio-Temporal LSTM with Trust Gates for 3D Human Action Recognition[C]//Proceeding of 14th European Conference on Computer Vision.Cham:Springer,2016:816-833.
[7]KE Q,BENNAMOUN M,AN S,et al.A New Representation of Skeleton Sequences for 3D Action Recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.New York:IEEE Press,2017:3288-3297.
[8]SI C,JING Y,WANG W,et al.Skeleton-based Action Recognition with Hierarchical Spatial Reasoning and Temporal Stack Learning Network[J].Pattern Recognition,2020,107:107511.
[9]PRESTI L L,LA CASCIA M.3D Skeleton-based Human Action Classification:A survey[J].Pattern Recognition,2016,53:130-147.
[10]FERNANDO B,GAVVES E,ORAMAS J M,et al.Modeling Video Evolution for Action Recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.New York:IEEE Press,2015:5378-5387.
[11]YU Y,SI X,HU C,et al.A Review of Recurrent Neural Networks:LSTM Cells and Network Architectures[J].Neural Computation,2019,31(7):1235-1270.
[12]KRIZHEVSKY A,SUTSKEVER I,HINTON G E.ImageNet Classification with Deep Convolutional Neural Networks[J].Communications of the ACM,2017,60(6):84-90.
[13]ZHANG P,LAN C,XING J,et al.View Adaptive Recurrent Neural Networks for High Performance Human Action Recognition from Skeleton Data[C]//Proceedings of the IEEE International Conference on Computer Vision.New York:IEEE Press,2017:2117-2126.
[14]LUO H L,TONG K,KONG F S.The Progress of Human Action Recognition in Videos Based on Deep Learning:A Review[J].Acta Electronica Sinica,2019,47(5):1162-1173.
[15]QIAN H F,YI J P,FU Y H.Review of Human Action Recognition Based on Deep Learning[J].Journal of Frontiers of Computer Science and Technology,2021,15(3):438-455.
[16]HUANG H X,WANG R P,LIU X Y.Review of Human Action Recognition Technology Based on 3D Convolution[J].Computer Science,2020,47(S2):139-144.
[17]NIEPERT M,AHMED M,KUTZKOV K.Learning Convolutional Neural Networks for Graphs[C]//Proceedings of the 33rd International Conference on Machine Learning.New York:PMLR,2016:2014-2023.
[18]YAN S J,XIONG Y J,LIN D H.Spatial Temporal Graph Convolutional Networks for Skeleton-based Action Recognition[C]//Proceeding of Thirty-second AAAI Conference on Artificial Intelligence.CA:AAAI,2018:7444-7452.
[19]SHI L,ZHANG Y F,CHENG J,et al.Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.New York:IEEE Press,2019:12026-12035.
[20]LI M S,CHEN S H,CHEN X,et al.Actional-Structural Graph Convolutional Networks for Skeleton-Based Action Recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.New York:IEEE Press,2019:3595-3603.
[21]WEN Y H,GAO L,FU H B,et al.Graph CNNs with Motif and Variable Temporal Block for Skeleton-Based Action Recognition[C]//Proceeding of Thirty-Third AAAI Conference on Artificial Intelligence.CA:AAAI,2019:8989-8996.
[22]HUSSEIN M E,TORKI M,GOWAYYED M A,et al.Human Action Recognition Using A Temporal Hierarchy of Covariance Descriptors on 3D Joint Locations[C]//Proceeding of the Twenty-Third International Joint Conference on Artificial Intelligence.CA:AAAI,2013:2466-2472.
[23]WANG J,LIU Z,WU Y,et al.Mining Actionlet Ensemble for Action Recognition with Depth Cameras[C]//Proceeding of IEEE Conference on Computer Vision and Pattern Recognition.NJ:IEEE,2012:1290-1297.
[24]VEMULAPALLI R,ARRATE F.Human Action Recognition by Representing 3D Skeletons as Points in a Lie Group[C]//Proceeding of the IEEE Conference on Computer Vision and Pattern Recognition.NJ:IEEE,2014:588-595.
[25]SHAHROUDY A,LIU J.NTU RGB+D:A Large Scale Dataset for 3D Human Activity Analysis[C]//Proceeding of the IEEE conference on Computer Vision and Pattern Recognition.NJ:IEEE,2016:1010-1019.
[26]ZHANG S,LIU X,XIAO J.On Geometric Features for Skeleton-based Action Recognition using Multilayer LSTM Networks[C]//Proceeding of IEEE Winter Conference on Applications of Computer Vision (WACV).NY:IEEE Computer Society,2017:148-157.
[27]LI C,ZHONG Q,XIE D,et al.Skeleton-based Action Recognition with Convolutional Neural Networks[C]//Proceeding of IEEE International Conference on Multimedia & Expo Workshops (ICMEW).NJ:IEEE,2017:597-600.
[28]LI B,LI X,ZHANG Z,et al.Spatio-temporal Graph Routing for Skeleton-based Action Recognition[C]//Proceeding of the AAAI Conference on Artificial Intelligence,2019(33):8561-8568.
[29]SUGIHARA G,MAY R,YE H,et al.Detecting Causality in Complex Ecosystems[J].Science,2012,338(6106):496-500.
[30]LIU H,LEI M,ZHANG N,et al.The Causal Nexus Between Energy Consumption,Carbon Emissions and Economic Growth:New Evidence from China,India and G7 Countries Using Convergent Cross Mapping[J].PloS one,2019,14(5):e0217319.
[31]BARRAQUAND F,PICOCHE C,DETTO M,et al.Inferring Species Interactions Using Granger Causality and Convergent Cross Mapping[J].Theoretical Ecology,2020,14:87-105.
[32]CAO Z,HIDALGO G,SIMON T,et al.OpenPose:Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.NJ:IEEE,2017:7291-7299.
[33]FERNANDO B,GAVVES E,ORAMAS J M,et al.Modeling Video Evolution for Action Recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.NJ:IEEE,2015:5378-5387.
[34]KIM T S,REITER A.Interpretable 3D Human Action Analysis with Temporal Convolutional Networks[C]//Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).NJ:IEEE,2017:1623-1631.