Computer Science (计算机科学), 2024, Vol. 51, Issue (11A): 231000073-5. doi: 10.11896/jsjkx.231000073
黄海新, 王钰瑶, 蔡明启
HUANG Haixin, WANG Yuyao, CAI Mingqi
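Abstract: Action recognition methods have achieved notable results in computer vision. Graph convolutional networks (GCNs) are an important tool for this task and are particularly effective at extracting features from graph-structured data. However, existing GCN-based action recognition networks still have shortcomings, such as over-reliance on a predefined skeleton topology graph and large temporal convolution kernels that are computationally expensive and inflexible. These shortcomings severely limit the expressive power and robustness of the models. This paper proposes a skeleton-based action recognition method built on adaptive, bottleneck multi-scale graph convolution. The adaptive spatial module learns and optimizes the skeleton topology graph and its parameters, which improves the model's flexibility and adaptability. The bottleneck multi-scale temporal module strengthens temporal modeling while reducing channel width to save computation and parameters. To verify the effectiveness of the proposed method, experiments are conducted on the large-scale skeleton action recognition datasets NTU-RGB+D and NTU-RGB+D 120. The results show that the improved algorithm achieves a measurable gain in accuracy.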
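To make the parameter saving from a narrower channel width concrete, the sketch below counts the weights of a single large-kernel temporal convolution and compares them with a bottleneck multi-scale block. It is only an illustration under assumed settings, not the authors' architecture: the abstract does not give the baseline kernel size (9 here), the number of branches (4), the per-branch kernel size (3), or the channel count (64). The function names are hypothetical as well.

```python
def conv_params(c_in, c_out, kernel, bias=True):
    """Weights (plus optional biases) of a temporal convolution with a (kernel x 1) filter."""
    return c_in * c_out * kernel + (c_out if bias else 0)


def large_kernel_params(channels, kernel=9):
    """Baseline (assumed): one full-width temporal convolution with a large kernel."""
    return conv_params(channels, channels, kernel)


def bottleneck_multiscale_params(channels, branches=4, kernel=3):
    """Assumed block: each branch applies a 1x1 convolution that reduces the
    width to channels // branches, followed by a small temporal convolution.
    Branches could use different dilation rates to cover several time scales;
    dilation changes the receptive field but not the parameter count."""
    width = channels // branches
    per_branch = conv_params(channels, width, 1) + conv_params(width, width, kernel)
    return branches * per_branch


if __name__ == "__main__":
    c = 64  # assumed channel count, for illustration only
    print("large kernel:", large_kernel_params(c))
    print("bottleneck multi-scale:", bottleneck_multiscale_params(c))
```

Under these assumed settings the bottleneck block needs about a fifth of the baseline's parameters, because each temporal convolution operates on a quarter of the channels.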