Computer Science, 2024, Vol. 51, Issue (8): 232-241. doi: 10.11896/jsjkx.230600143
WANG Chao1, TANG Chao1, WANG Wenjian2, ZHANG Jing3
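Abstract: Deep learning networks have a limited ability to learn representations from single-modality infrared data. To address this problem, this paper proposes an infrared human action recognition method based on a multimodal attention network. Because deep learning models cannot be trained on, or classify, raw video directly, a preprocessing module first converts the video into infrared views. Edge information and optical flow information are then extracted from the infrared views using the Sobel operator and the L1-norm-based total variation (TV-L1) optical flow method, respectively, yielding edge views and optical flow views. Next, the infrared views, edge views, and optical flow views are each fed into one stream of a three-stream network that incorporates attention modules for feature learning. The multimodal features extracted by the three streams are then fused. Finally, the fused feature vector is fed into a random forest for training and classification. Experiments on the public NTU RGB+D dataset and on a self-built dataset show that the proposed method achieves good recognition performance.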
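As an illustration of the edge-extraction step mentioned in the abstract, the following is a minimal sketch of the Sobel operator applied to a single grayscale frame. It is not the authors' implementation: the function name `sobel_edge_view`, the nested-list frame representation, and the zero-valued border are assumptions made for this example, and it is written in plain Python so that it runs without dependencies.

```python
# Sobel kernels for horizontal (GX) and vertical (GY) intensity gradients.
GX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
GY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_edge_view(frame):
    """Return the Sobel gradient magnitude of a 2-D grayscale frame.

    `frame` is a list of equal-length rows of numbers. Border pixels are
    left at 0.0 (an assumption; other border policies are possible).
    """
    height, width = len(frame), len(frame[0])
    edges = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0.0
            # Accumulate both kernel responses over the 3x3 neighbourhood.
            for dy in range(3):
                for dx in range(3):
                    pixel = frame[y + dy - 1][x + dx - 1]
                    gx += GX[dy][dx] * pixel
                    gy += GY[dy][dx] * pixel
            edges[y][x] = (gx * gx + gy * gy) ** 0.5
    return edges


# Toy 5x5 frame (hypothetical data): dark left part, bright right part.
frame = [[0, 0, 10, 10, 10] for _ in range(5)]
edge_view = sobel_edge_view(frame)
```

In this toy frame the response is nonzero only in the interior columns that straddle the dark-to-bright transition, which is the kind of contour information an edge view is meant to expose to its network stream.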