Computer Science (计算机科学) ›› 2026, Vol. 53 ›› Issue (2): 89-98. doi: 10.11896/jsjkx.250800007
CHEN Haitao1, LIANG Junwei2, CHEN Chen3, WANG Yufan4, ZHOU Yu1
Abstract: Against the backdrop of intelligent sports and educational informatization, fine-grained human action recognition has become a key technology for physical education teaching and training assessment. To address the problems that traditional action recognition methods face in complex sports scenarios, namely insufficient use of modal information and limited representation of spatiotemporal structure, this paper proposes a multimodal graph convolutional network that fuses skeleton data with wearable sensor information. First, a "virtual sensor" fusion method is proposed: wearable sensor signals are mapped onto the spatiotemporal graph built from skeleton joints and fused there, effectively improving both the modeling of fine action details and cross-modal semantic consistency. Second, a multi-layer graph convolutional network tailored to complex movement patterns is constructed; by partitioning the body into local regions, it strengthens the model's recognition ability in complex sports scenarios. In addition, targeting fencing, a competitive sport with technically complex actions, a multimodal dataset covering typical technical actions at different skill levels was independently collected and constructed, providing data support for fine-grained action recognition and skill-level assessment. Experiments on this dataset and on several standard benchmarks show that the proposed method outperforms mainstream approaches in both action recognition accuracy and skill-level judgment, offering a new modeling framework and technical support for intelligent recognition and assessment in physical education, with promising application prospects.
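The "virtual sensor" fusion idea described above can be illustrated with a minimal sketch. The code below is not the paper's implementation; it assumes a simple scheme in which each wearable IMU is assigned to its nearest skeleton joint, its signal channels are appended to that joint's node features (zeros elsewhere), and a single normalized graph-convolution step then propagates the fused features over the joint graph. All function names and the per-sensor-to-joint mapping are illustrative assumptions:

```python
import numpy as np

def fuse_virtual_sensors(skeleton, sensor_signals, sensor_to_joint):
    """Append each wearable sensor's channels to its assigned joint.

    skeleton:       (T, J, C_s) joint coordinates over T frames.
    sensor_signals: dict sensor_id -> (T, C_i) IMU readings.
    sensor_to_joint: dict sensor_id -> index of the joint the sensor is worn near.
    Returns a (T, J, C_s + C_i) array; joints without a sensor get zeros
    in the extra channels, so every node shares one feature space.
    """
    T, J, C_s = skeleton.shape
    C_i = next(iter(sensor_signals.values())).shape[1]
    fused = np.zeros((T, J, C_s + C_i))
    fused[:, :, :C_s] = skeleton
    for sid, sig in sensor_signals.items():
        fused[:, sensor_to_joint[sid], C_s:] = sig
    return fused

def gcn_layer(x, adj, weight):
    """One spatial graph-convolution step: row-normalized aggregation,
    linear projection, ReLU. x: (T, J, C); adj: (J, J); weight: (C, C_out)."""
    a_norm = adj / np.maximum(adj.sum(axis=1, keepdims=True), 1e-6)
    agg = np.einsum('jk,tkc->tjc', a_norm, x)  # average neighbor features per frame
    return np.maximum(agg @ weight, 0.0)
```

A multi-layer network in the spirit of the abstract would stack such layers, with adjacency matrices restricted to local body partitions (e.g. arms, legs, torso) in the earlier layers.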