Computer Science, 2020, Vol. 47, Issue (1): 117-123. doi: 10.11896/jsjkx.190100231

• Computer Graphics & Multimedia •

Overview of Content-based Video Retrieval

HU Zhi-jun1,2, XU Yong3

  1. Guizhou Provincial Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China
  2. College of Computer Science & Technology, Guizhou University, Guiyang 550025, China
  3. Harbin Institute of Technology (Shenzhen), Shenzhen, Guangdong 518055, China
  • Received: 2019-01-28 Published: 2020-01-19
  • About the authors: HU Zhi-jun, born in 1981, doctoral student, lecturer. His main research interests include fractal image compression and image and video retrieval. XU Yong, born in 1972, Ph.D., professor, Ph.D. supervisor. His main research interests include pattern recognition, biometrics, machine learning and video analysis.
  • Supported by:
    This work was supported by the Foundation of Guizhou Provincial Key Laboratory of Public Big Data (2018BDKFJJ001).

Abstract: Video is an information-rich medium. With the rise of short-video apps such as Douyin, the number of videos on the network and in databases has increased dramatically, and manual labeling is no longer adequate for video retrieval. Retrieving video by extracting the spatial features of individual frames or the temporal features between frames lets users search and categorize video more objectively and efficiently. This paper summarizes content-based video retrieval algorithms, including some classical algorithms, and reviews the research on and application of deep learning in content-based video retrieval. Finally, the development prospects of deep learning in video retrieval are analyzed.

Key words: Video retrieval, Convolutional neural network, Key frame, Feature extraction, Shot segmentation

CLC Number: TP391