Computer Science ›› 2025, Vol. 52 ›› Issue (5): 220-226.doi: 10.11896/jsjkx.240600125

• Computer Graphics & Multimedia •

Hypergraph Convolutional Network with Multi-perspective Topology Refinement for Skeleton-based Action Recognition

HUANG Qian, SU Xinkai, LI Chang, WU Yirui   

  1. College of Computer Science and Software Engineering, Hohai University, Nanjing 211106, China
  • Received: 2024-06-21 Revised: 2024-11-08 Online: 2025-05-15 Published: 2025-05-12
  • About author: HUANG Qian, born in 1981, Ph.D., is a senior member of CCF (No.08758S). His main research interests include industry-specific multimedia computing.
    SU Xinkai, born in 1998, postgraduate, is a member of CCF (No.T0865G). His main research interests include computer vision.

Abstract: Since the human skeleton is a natural topological structure, graph convolutional networks (GCNs) are widely used for skeleton-based human action recognition. Recent work represents skeleton sequences as spatio-temporal graphs and uses topology graphs to model the correlations between human joints. However, current GCN-based methods focus only on pairwise joint relationships and ignore potential high-order relationships among joints, so the graph structure of skeleton data is underutilized. To address this problem, this paper uses hypergraphs to represent the potential high-order relationships among joints. Because these high-order relationships may vary from frame to frame within a skeleton sequence, the model dynamically learns the high-order correlations within each frame with the K-NN method and initializes the hypergraph structure from the high-level representations of the joints. Since the hyperedges adjust dynamically as the joint features evolve, this hypergraph structure can better capture the high-order relationships between joints. In current hypergraph neural networks, hypergraph convolution transforms the hypergraph into a simple graph via the Laplacian transformation and then performs graph convolution, which does not fully exploit the characteristics of the hypergraph. The proposed hypergraph convolution makes better use of the relationship between hyperedges and hypernodes by performing graph convolution on each hyperedge to learn the high-order relationships between joints. A second problem with current GCN-based action recognition methods is that the topology built to represent pairwise joint relationships is not dynamic enough; for example, the same topology is used for all frames in a sample. To fully explore the dynamic correlations between pairwise joints, a frame-wise topology modeling method is proposed to capture the correlations between pairwise joints in different frames, and a channel-level topology modeling method is proposed to capture the correlations between different feature types. Finally, a hypergraph convolutional network with multi-perspective topology refinement (HyperMTR-GCN) is developed for skeleton-based action recognition. It shows a significant advantage on the NTU RGB+D and NTU RGB+D 120 datasets: compared with 2s-AGCN, it improves by 3.7% on the X-sub benchmark of NTU RGB+D and by 5.7% on the X-sub benchmark of NTU RGB+D 120.
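
The per-frame K-NN hypergraph construction and the per-hyperedge aggregation described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `knn_incidence` and `hyperedge_aggregate`, the use of Euclidean distance, and the plain mean aggregation (with no learnable weights) are all assumptions made for illustration only.

```python
import numpy as np


def knn_incidence(features: np.ndarray, k: int) -> np.ndarray:
    """Build a hypergraph incidence matrix for one frame with K-NN.

    features: (V, C) array holding one feature vector per joint.
    Returns an array of shape (V, V) in which column e is the hyperedge
    centred on joint e: joint e itself plus its k - 1 nearest joints.
    Euclidean distance is an assumption of this sketch.
    """
    num_joints = features.shape[0]
    # Pairwise squared Euclidean distances between joint features.
    diff = features[:, None, :] - features[None, :, :]
    dist = (diff ** 2).sum(axis=-1)
    # Force every joint to rank first in its own neighbour list.
    np.fill_diagonal(dist, -1.0)
    nearest = np.argsort(dist, axis=1, kind="stable")[:, :k]
    incidence = np.zeros((num_joints, num_joints))
    for centre in range(num_joints):
        incidence[nearest[centre], centre] = 1.0
    return incidence


def hyperedge_aggregate(features: np.ndarray, incidence: np.ndarray) -> np.ndarray:
    """Two-step mean aggregation: joints -> hyperedges -> joints.

    A hypothetical, weight-free stand-in for a hyperedge convolution.
    """
    # Joint -> hyperedge: average the joints that belong to each hyperedge.
    edge_size = incidence.sum(axis=0)
    edge_feat = (incidence.T @ features) / edge_size[:, None]
    # Hyperedge -> joint: average the hyperedges each joint belongs to.
    node_degree = incidence.sum(axis=1)
    return (incidence @ edge_feat) / node_degree[:, None]


# Toy frame: 5 joints with 1-D features; joints 0-2 and 3-4 form two clusters.
feats = np.array([[0.0], [1.0], [3.0], [10.0], [11.0]])
H = knn_incidence(feats, k=2)
out = hyperedge_aggregate(feats, H)
```

Because the incidence matrix is recomputed from the current joint features, the hyperedges change whenever those features change, which is the sense in which the abstract calls the hypergraph dynamic. Each joint always belongs to its own hyperedge, so neither division above can hit a zero denominator.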

Key words: Action recognition, Graph convolutional network, Hypergraph neural network, Skeleton modeling, Topology refinement

CLC Number: 

  • TP391.41