Computer Science, 2025, Vol. 52, Issue (5): 220-226. doi: 10.11896/jsjkx.240600125

• Computer Graphics & Multimedia •

Hypergraph Convolutional Network with Multi-perspective Topology Refinement for Skeleton-based Action Recognition

HUANG Qian, SU Xinkai, LI Chang, WU Yirui   

  1. College of Computer Science and Software Engineering, Hohai University, Nanjing 211106, China
  • Received: 2024-06-21 Revised: 2024-11-08 Online: 2025-05-15 Published: 2025-05-12
  • About author: HUANG Qian, born in 1981, Ph.D., is a senior member of CCF (No.08758S). His main research interests include industry-specific multimedia computing.
    SU Xinkai, born in 1998, postgraduate, is a member of CCF (No.T0865G). His main research interests include computer vision.

Abstract: Since the human skeleton is a natural topological structure, graph convolutional networks (GCNs) are widely used for skeleton-based human action recognition. In recent research, skeleton sequences are represented as spatio-temporal graphs, and topology graphs are used to model the correlations between human joints. However, current GCN-based methods focus only on pairwise joint relationships and ignore potential high-order relationships that go beyond pairs, so the graph structure of skeleton data is underutilized. To solve this problem, this paper introduces hypergraphs to represent potential high-order relationships among joints. Since the high-order relationships of joints may vary from frame to frame within a skeleton sequence, the model dynamically learns the high-order correlations within each frame using the K-NN method and initializes the hypergraph structure from the high-level representations of the joints. This hypergraph structure can better learn the high-order relationships between joints because the hyperedges adjust dynamically as the joint features evolve. In current hypergraph neural networks, hypergraph convolution transforms the hypergraph into a simple graph using a Laplacian transformation and then performs graph convolution, which does not fully exploit the characteristics of the hypergraph. The proposed hypergraph convolution method makes better use of the relationship between hyperedges and hypernodes by performing hyperedge graph convolution on each hyperedge to learn the high-order relationships between joints. The second problem with current GCN-based human action recognition methods is that the topology built to represent pairwise joint relationships is not dynamic enough; for example, the same topology is used for all frames in a sample. To fully explore the dynamic correlations between pairs of joints, a frame-wise topology modeling method is proposed to capture the correlations between pairwise joints in different frames, and a channel-level topology modeling method is proposed to capture the correlations between different feature types. Finally, a hypergraph convolutional network with multi-perspective topology refinement (HyperMTR-GCN) is developed for skeleton-based action recognition, and it shows a significant advantage on the NTU RGB+D and NTU RGB+D 120 datasets. Specifically, compared to 2s-AGCN, it improves by 3.7% on the X-sub benchmark of NTU RGB+D and by 5.7% on the X-sub benchmark of NTU RGB+D 120.
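
The two hypergraph ideas in the abstract (K-NN hyperedges built from joint features within a frame, and aggregation through hyperedges rather than through a Laplacian-derived simple graph) can be illustrated with a minimal NumPy sketch. This is not the authors' HyperMTR-GCN implementation: the function names `knn_incidence` and `hyperedge_conv`, the mean aggregation, the choice of one hyperedge per joint, and the value k=3 are all assumptions made for illustration only.

```python
import numpy as np


def knn_incidence(features: np.ndarray, k: int) -> np.ndarray:
    """Build an incidence matrix H (joints x hyperedges) for one frame.

    Assumption: one hyperedge per joint, containing that joint and its
    nearest neighbours in feature space (k members in total).
    """
    n = features.shape[0]
    # Pairwise squared Euclidean distances between joint features.
    diff = features[:, None, :] - features[None, :, :]
    dist = (diff ** 2).sum(axis=-1)
    # Indices of the k closest joints for every joint.
    nearest = np.argsort(dist, axis=1)[:, :k]
    H = np.zeros((n, n))
    for e in range(n):
        H[nearest[e], e] = 1.0
        H[e, e] = 1.0  # hyperedge e always contains joint e
    return H


def hyperedge_conv(features: np.ndarray, H: np.ndarray, weight: np.ndarray) -> np.ndarray:
    """Two-step aggregation: joints -> hyperedges -> joints, then a linear map."""
    edge_degree = H.sum(axis=0)          # number of joints in each hyperedge
    node_degree = H.sum(axis=1)          # number of hyperedges containing each joint
    edge_feat = (H.T @ features) / edge_degree[:, None]   # mean over members
    node_feat = (H @ edge_feat) / node_degree[:, None]    # mean over incident hyperedges
    return node_feat @ weight


# Hypothetical input: 25 joints with 16-dimensional features for one frame.
rng = np.random.default_rng(0)
x = rng.normal(size=(25, 16))
H = knn_incidence(x, k=3)
W = rng.normal(size=(16, 8))
y = hyperedge_conv(x, H, W)
print(H.shape, y.shape)
```

Because `H` is recomputed from the current joint features, calling `knn_incidence` again on the features of a deeper layer or a different frame yields a different hyperedge set, which is the sense in which the hyperedges follow the evolution of the joint features.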

Key words: Action recognition, Graph convolutional network, Hypergraph neural network, Skeleton modeling, Topology refinement

CLC Number: TP391.41