Computer Science ›› 2023, Vol. 50 ›› Issue (11A): 220800147-6. doi: 10.11896/jsjkx.220800147
李华, 赵领娣, 陈雨杰, 杨杨, 杜新兆
LI Hua, ZHAO Lingdi, CHEN Yujie, YANG Yang, DU Xinzhao
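Abstract: Conventional action recognition based on RGB video is easily affected by factors such as illumination intensity and viewing angle. Skeleton-based action recognition is less sensitive to these factors and has become one of the mainstream approaches. However, existing skeleton-based methods have large parameter counts and run slowly. To address these problems, a lightweight multi-stream fusion graph convolutional framework for action recognition is proposed. First, data fusing several kinds of information (human joints, bone edges, joint velocities, and bone velocities) are fed into the spatial graph convolution module. Second, a spatial attention mechanism is added to the spatial graph convolution module to better capture the relationships between joints. Finally, depthwise convolution and pointwise convolution are used in the temporal convolution module to reduce the number of parameters. Compared with the baseline network SGN, on the NTU-RGB+D 120 dataset the proposed network achieves an improvement of 2.3% under cross-view evaluation and 1.9% under cross-setup evaluation, while using 0.12×10^6 fewer parameters, which verifies the effectiveness of the proposed network.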
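The two ideas in the abstract that lend themselves to a small illustration are the four input streams and the parameter saving from depthwise plus pointwise convolution. The following is a minimal sketch, not the authors' implementation: the function names, the parent-index encoding of the skeleton, the zero velocity for the first frame, and the channel and kernel sizes in the usage note are all assumptions made for illustration. How the four streams are fused is not specified in the abstract and is not shown here.

```python
from typing import List, Sequence, Tuple

Vec = Tuple[float, ...]   # one coordinate vector, e.g. (x, y, z)
Frame = List[Vec]         # one vector per joint


def bone_stream(joints: Sequence[Frame], parents: Sequence[int]) -> List[Frame]:
    """Bone vector of each joint: its coordinate minus its parent's coordinate.

    Assumption: parents[j] is the index of joint j's parent, and the root
    joint lists itself as parent, so its bone vector is zero.
    """
    return [
        [tuple(c - p for c, p in zip(frame[j], frame[parents[j]]))
         for j in range(len(frame))]
        for frame in joints
    ]


def velocity_stream(seq: Sequence[Frame]) -> List[Frame]:
    """Frame-to-frame difference; the first frame is given zero velocity."""
    dims = len(seq[0][0])
    zero = [tuple(0.0 for _ in range(dims)) for _ in seq[0]]
    diffs = [
        [tuple(b - a for a, b in zip(prev[j], cur[j])) for j in range(len(cur))]
        for prev, cur in zip(seq, seq[1:])
    ]
    return [zero] + diffs


def temporal_conv_params(c_in: int, c_out: int, kernel: int) -> Tuple[int, int]:
    """Weight counts (biases ignored) for a 1D temporal convolution.

    standard:  one kernel of length `kernel` per (input, output) channel pair.
    separable: a depthwise kernel per input channel, then a 1x1 pointwise mix.
    """
    standard = c_in * c_out * kernel
    separable = c_in * kernel + c_in * c_out
    return standard, separable
```

As a usage example with hypothetical sizes, `temporal_conv_params(64, 64, 3)` gives 12288 weights for the standard convolution and 4288 for the depthwise plus pointwise pair. For the streams, joint velocity is `velocity_stream(joints)` and bone velocity is `velocity_stream(bone_stream(joints, parents))`.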