Computer Science (计算机科学), 2022, Vol. 49, Issue (2): 156-161. doi: 10.11896/jsjkx.220100061

• Computer Vision: Theory and Application •


苗启广, 辛文天, 刘如意, 谢琨, 王泉, 杨宗凯   

  1. School of Computer Science and Technology, Xidian University, Xi'an 710071
  • Received: 2022-01-06 Revised: 2022-01-09 Online: 2022-02-15 Published: 2022-02-23
  • Corresponding author: LIU Ru-yi (ruyiliu@xidian.edu.cn)
  • About author: qgmiao@xidian.edu.cn

Graph Convolutional Skeleton-based Action Recognition Method for Intelligent Behavior Analysis

MIAO Qi-guang, XIN Wen-tian, LIU Ru-yi, XIE Kun, WANG Quan, YANG Zong-kai   

  1. School of Computer Science and Technology, Xidian University, Xi'an 710071, China
  • Received: 2022-01-06 Revised: 2022-01-09 Online: 2022-02-15 Published: 2022-02-23
  • About author: MIAO Qi-guang, born in 1972, Ph.D, professor, Ph.D supervisor, is a senior member of China Computer Federation and AC of CCF YOCSEF. His main research interests include CV and ML.
    LIU Ru-yi, born in 1989, Ph.D, lecturer, is a member of China Computer Federation. Her main research interests include computer vision, big data analysis and object detection in remote sensing.
  • Supported by:
    National New Engineering Research and Practice Project (E-GCJYZL20200818), Computer Basic Education Teaching Research Project of the National Institute of Computer Basic Education Research Association (2021-AFCEC-459), China Adult Education Association's “14th Five-Year Plan” Adult Continuing Education Research Plan Key Project (2021-414ZA), Key Research/Key Projects of Shaanxi Higher Education Teaching Reform Research (21JG001, 21BZ014), Guangxi Key Laboratory of Trusted Software (KX202061, KX202041), Xidian University Education and Teaching Reform Research Key Research Project (A21003), New Experimental Development and New Experimental Equipment Development Key Projects (SY21022I) and Academy of Integrated Circuit Innovation of Xidian University in Chongqing IUR Project (CQIRI-CXYHT-2021-06).

Abstract: Smart education, that is, the informatization of education, is a new generation of education model built on modern information technology, and smart behavior analysis is a core component of smart education systems. In complex classroom scenarios, traditional action recognition and classification algorithms fall seriously short in both accuracy and timeliness. To address this, a skeleton-based action recognition algorithm using graph convolution with separation and attention mechanisms, the Depthwise Separable Attention Graph Convolutional Network (DSA-GCN), is proposed. First, to overcome the inherently insufficient aggregation of channel-domain information in traditional algorithms, multi-dimensional channel mapping is performed by point-wise convolution. This combines the ability of spatial-temporal graph convolution (ST-GC) to preserve the original spatio-temporal information of the input skeleton sequence with the ability of depthwise separable convolution to separate spatial and channel feature learning, strengthening the model's feature learning and abstract expressiveness. Second, a multi-dimensional fused attention mechanism is adopted: in the spatial convolution domain, self-attention and channel attention improve the model's dynamic sensitivity; in the temporal convolution domain, fused temporal and channel attention improves the discrimination of key frames. Experimental results on two large datasets, NTU RGB+D and N-UCLA, show that DSA-GCN achieves excellent performance and efficiency, demonstrating the model's improved ability to aggregate channel-domain information.

Key words: Action recognition, Attention mechanism, Depth-wise separable convolution, Graph convolutional neural network, Skeleton-based action classification, Smart behavior analysis
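
The separation idea in the abstract, learning spatial (joint-wise) and channel-wise features in two distinct steps, can be illustrated with a toy sketch. This is not the DSA-GCN layer itself, whose exact definition is not given on this page. It assumes a single frame of skeleton features, a row-normalized adjacency with self-loops, a per-channel scalar weight for the spatial step, and a dense point-wise matrix for the channel step; all function and parameter names are hypothetical.

```python
# Toy sketch only: shapes, weights, and function names are illustrative
# assumptions, not the layer defined in the DSA-GCN paper.

def normalize_adjacency(adj):
    """Add self-loops, then row-normalize so each joint averages its neighborhood."""
    n = len(adj)
    with_loops = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                  for i in range(n)]
    return [[value / sum(row) for value in row] for row in with_loops]


def depthwise_graph_step(x, a_hat, channel_scale):
    """Spatial step: aggregate over joints separately per channel (no channel mixing).

    x is [C][V] (channels x joints) for a single frame; channel_scale is [C].
    """
    joints = range(len(a_hat))
    return [[channel_scale[c] * sum(a_hat[v][u] * x[c][u] for u in joints)
             for v in joints]
            for c in range(len(x))]


def pointwise_step(x, weight):
    """Channel step: a 1x1 (point-wise) projection applied at every joint.

    weight is [C_out][C_in]; the output is [C_out][V].
    """
    joints = range(len(x[0]))
    return [[sum(w_oc * x[c][v] for c, w_oc in enumerate(row)) for v in joints]
            for row in weight]


def separable_graph_conv(x, adj, channel_scale, weight):
    """Chain the two steps: per-channel graph aggregation, then channel mixing."""
    a_hat = normalize_adjacency(adj)
    return pointwise_step(depthwise_graph_step(x, a_hat, channel_scale), weight)


# Three joints in a chain (0 - 1 - 2), two input channels, three output channels.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
x = [[1.0, 2.0, 3.0], [0.0, 0.0, 6.0]]
out = separable_graph_conv(x, adj, channel_scale=[1.0, 0.5],
                           weight=[[1.0, 1.0], [1.0, -1.0], [0.0, 2.0]])
print(out)
```

In this sketch the spatial step never mixes channels and the point-wise step never mixes joints, so the channel mapping can change the channel dimension (here from two to three) without touching the graph structure.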


  • CLC Number: TP391