Advances on Human Action Recognition in Realistic Scenes

doi:10.11896/j.issn.1002-137X.2014.12.001

Abstract

Human action recognition has become an active and challenging research topic in computer vision. Mainstream methods follow a framework of visual feature detection, action representation, and action classification. Action recognition in simple scenes has already been achieved. This paper reviews in detail the research on human action recognition in realistic scenes from the perspectives of research scope, feature detection, and action modeling. Unlike several recently published surveys, it analyzes the state of the art and recent advances in the field, such as pose estimation and human action representation based on sparse coding or deep learning. Finally, the open problems, difficulties, and possible solutions are discussed.

Key words: Human action recognition, visual feature detection, action representation, computer vision

LEI Qing, CHEN Duan-sheng and LI Shao-zi. Advances on Human Action Recognition in Realistic Scenes[J]. Computer Science, 2014, 41(12): 1-7.

