Computer Science, 2015, Vol. 42, Issue (11): 293-298. doi: 10.11896/j.issn.1002-137X.2015.11.060

Human Action Recognition by Visual Word Based on Local and Global Features

XIE Fei, GONG Sheng-rong, LIU Chun-ping and JI Yi   

  • Online: 2018-11-14 Published: 2018-11-14

Abstract: Unlike methods based on low-level features, human action recognition based on visual words adds mid-level semantic information to the features and thereby improves recognition accuracy. With complex backgrounds or dynamic scenes, however, the effectiveness of visual words may deteriorate. We propose a new method that combines local and global features to generate visual words. First, our approach uses a saliency map to detect the rectangular regions around humans. Then, inside these rectangles, 3D-SIFT descriptors are computed around interest points detected with a dynamic threshold matrix to describe local features. We also add HOOF to describe the global motion information. These visual words capture important semantic information in the video, such as brightness contrast and motion information. Compared with state-of-the-art methods, the proposed method improves action recognition performance by 6.4% on the KTH dataset and 6.5% on the UCF dataset. The experimental results also indicate that our visual dictionary has advantages over others in both simple-background and dynamic scenes.

Key words: Visual words, Saliency map, 3D-SIFT, Dynamic threshold matrix, HOOF
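
To make two of the ingredients named in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a dense optical-flow field is already available as a NumPy array and that a visual dictionary (codebook) has already been learned, for example by clustering. The function names `flow_orientation_histogram` and `bag_of_words`, the bin count, and the toy data are all hypothetical. The histogram is a simplified, magnitude-weighted orientation histogram in the spirit of HOOF; it does not reproduce the exact binning of the published HOOF descriptor, and the saliency-map and 3D-SIFT stages are omitted.

```python
import numpy as np


def flow_orientation_histogram(flow, n_bins=8):
    """Simplified HOOF-style global motion descriptor (illustrative only).

    flow: array of shape (H, W, 2) holding per-pixel (dx, dy) motion.
    Returns an L1-normalised histogram of flow directions in which each
    pixel votes with its flow magnitude.
    """
    dx = flow[..., 0].ravel()
    dy = flow[..., 1].ravel()
    magnitude = np.hypot(dx, dy)
    angle = np.arctan2(dy, dx)  # values in [-pi, pi]

    # Map angles to integer bins 0 .. n_bins - 1; angle == pi would land
    # on n_bins, so it is clipped into the last bin.
    bins = np.floor((angle + np.pi) / (2.0 * np.pi) * n_bins).astype(int)
    bins = np.clip(bins, 0, n_bins - 1)

    hist = np.bincount(bins, weights=magnitude, minlength=n_bins)
    total = hist.sum()
    return hist / total if total > 0 else hist


def bag_of_words(descriptors, codebook):
    """Quantise descriptors to their nearest visual word and count them.

    descriptors: array (N, D) of local descriptors (e.g. 3D-SIFT vectors).
    codebook:    array (K, D) of visual words, assumed to be given.
    Returns a length-K normalised word-frequency histogram.
    """
    diffs = descriptors[:, None, :] - codebook[None, :, :]
    sq_dist = (diffs ** 2).sum(axis=2)      # (N, K) squared distances
    words = sq_dist.argmin(axis=1)          # nearest word per descriptor
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / max(hist.sum(), 1.0)


if __name__ == "__main__":
    # Toy data: uniform rightward motion and three 2-D "descriptors".
    toy_flow = np.zeros((4, 4, 2))
    toy_flow[..., 0] = 1.0
    print(flow_orientation_histogram(toy_flow))

    toy_codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
    toy_desc = np.array([[1.0, 1.0], [9.0, 9.0], [11.0, 10.0]])
    print(bag_of_words(toy_desc, toy_codebook))
```

In a pipeline like the one described, the word-frequency histogram of local descriptors and the global motion histogram could be concatenated into one video-level feature for a classifier. How the two are actually combined in the paper is not stated in the abstract, so that step is left out here.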

