Computer Science, 2024, Vol. 51, Issue (11A): 231000073-5. doi: 10.11896/jsjkx.231000073

• Image Processing & Multimedia Technology •

Bottleneck Multi-scale Graph Convolutional Network for Skeleton-based Action Recognition

HUANG Haixin, WANG Yuyao, CAI Mingqi   

  1. School of Automation and Electrical Engineering, Shenyang Ligong University, Shenyang 110159, China
  • Online: 2024-11-16 Published: 2024-11-13
  • About author: HUANG Haixin, born in 1973, Ph.D., associate professor. Her main research interests include machine learning, artificial intelligence, and intelligent grids.
  • Supported by:
    National Natural Science Foundation of China (61672359).

Abstract: Action recognition methods have achieved significant success in computer vision. Graph convolutional networks (GCNs) are a key technique for action recognition, especially for extracting features from graph-structured data. However, existing GCNs rely heavily on predefined skeleton topology graphs and handle large temporal convolution kernels inflexibly, which significantly constrains their expressive power and robustness. In this paper, we propose an adaptive bottleneck multi-scale graph convolutional method for skeleton-based action recognition. The adaptive spatial module optimizes the structure and parameters of the skeleton topology graph, enhancing the model's flexibility. The bottleneck multi-scale temporal module improves temporal modeling while reducing channel width to save computation and parameters. Experimental results on the large-scale skeleton action recognition datasets NTU-RGB+D and NTU-RGB+D 120 show that the proposed model improves recognition accuracy to some extent.
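
The parameter saving that the abstract attributes to the bottleneck multi-scale temporal module can be illustrated with a rough count. The sketch below is not the authors' architecture: the channel width (64), the number of branches (4), the kernel sizes (9 and 5), and all function names are assumptions chosen only for illustration. It compares one full-width temporal convolution with a large kernel against several narrow branches that each reduce the channel width with a 1x1 convolution before applying a smaller, possibly dilated, temporal kernel.

```python
# Illustrative parameter arithmetic only; every layer size here is an assumption,
# not a value taken from the paper.

def conv_params(c_in, c_out, kernel_t, bias=True):
    """Weights (plus optional bias) of a temporal convolution with a kernel_t x 1 kernel."""
    return c_in * c_out * kernel_t + (c_out if bias else 0)


def single_branch_params(channels, kernel_t=9):
    """Baseline: one full-width temporal convolution with a large kernel."""
    return conv_params(channels, channels, kernel_t)


def bottleneck_multiscale_params(channels, branches=4, kernel_t=5):
    """Hypothetical bottleneck design: each branch first reduces the input to
    channels // branches with a 1x1 convolution, then applies a kernel_t x 1
    temporal convolution at that reduced width. Dilation is not an argument
    because it changes the receptive field, not the parameter count."""
    width = channels // branches
    per_branch = conv_params(channels, width, 1) + conv_params(width, width, kernel_t)
    return branches * per_branch


def receptive_field(kernel_t, dilation):
    """Number of frames spanned by a dilated temporal kernel."""
    return dilation * (kernel_t - 1) + 1


if __name__ == "__main__":
    print("single large kernel :", single_branch_params(64))
    print("bottleneck branches :", bottleneck_multiscale_params(64))
    for d in (1, 2, 3, 4):
        print(f"dilation {d}: spans {receptive_field(5, d)} frames")
```

Under these assumed sizes, the branches together use roughly a quarter of the baseline's parameters, while a dilation of 2 already lets a 5-frame kernel span the same 9 frames as the large kernel. The actual savings in the paper depend on its real layer configuration, which the abstract does not state.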

Key words: Action recognition, Skeleton modality, Graph convolution network, Video classification, Computer vision

CLC Number: 

  • TP183