Computer Science, 2021, Vol. 48, Issue (3): 50-59. doi: 10.11896/jsjkx.210100210

Special Issue: Advances on Multimedia Technology


Survey on Video-based Face Recognition

BAI Zi-yi, MAO Yi-rong, WANG Rui-ping

  1. Key Laboratory of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
    School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2020-12-05 Revised: 2021-01-27 Online: 2021-03-15 Published: 2021-03-05
  • About author: BAI Zi-yi, born in 1997, postgraduate. Her main research interests include computer vision and pattern recognition.
    WANG Rui-ping, born in 1981, Ph.D., professor, Ph.D. supervisor, is a senior member of China Computer Federation. His main research interests include computer vision and pattern recognition.
  • Supported by:
    National Natural Science Foundation of China (61922080, U19B2036, 61772500).

Abstract: Face recognition is a key technology in biometrics and has attracted wide research attention over the past decades. Video-based face recognition refers specifically to extracting the key information about a person's face from a video in order to identify that person. Compared with image-based face recognition, the variation patterns of faces in videos are much more diverse, and the frames within a video can differ greatly from one another. Current research focuses on how to extract the key facial features from lengthy videos. This paper first introduces the research value and challenges of video-based face recognition, and then traces the development of current research. According to how the video is modeled, traditional image-set-based methods are divided into four categories: linear subspace modeling, affine subspace modeling, nonlinear manifold modeling and statistical modeling. Methods based on image fusion in the deep learning setting are also introduced. The paper then briefly reviews existing datasets for video-based face recognition and the commonly used performance metrics. Finally, grayscale features and deep features are used to evaluate representative methods on the YTC and IJB-A datasets. Experimental results show that a deep neural network trained on large-scale data can extract robust features from each frame, which greatly improves the performance of video-based face recognition. Moreover, effective video modeling helps to identify the underlying patterns of facial variation. As a result, more discriminative information can be mined from the large number of samples in a video sequence and the interference of noisy samples can be eliminated, which suggests that video-based face recognition is well suited to a wide range of practical application scenarios.
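
As a rough illustration of the fusion idea mentioned in the abstract, the sketch below combines per-frame feature vectors into a single video-level vector and compares two such vectors by cosine similarity. It is a minimal sketch under stated assumptions, not the method of any work surveyed in the paper: the function names (`l2_normalize`, `fuse_frames`, `cosine_similarity`), the weighted-average fusion rule, and the per-frame quality weights are all hypothetical and chosen only for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse_frames(frame_features, weights=None):
    """Fuse per-frame features into one video-level feature by weighted averaging.

    `weights` is a hypothetical per-frame quality score; with no weights given,
    this reduces to plain average pooling.
    """
    if not frame_features:
        raise ValueError("at least one frame feature is required")
    if weights is None:
        weights = [1.0] * len(frame_features)
    if len(weights) != len(frame_features):
        raise ValueError("one weight per frame is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    dim = len(frame_features[0])
    fused = [0.0] * dim
    for feat, w in zip(frame_features, weights):
        for i in range(dim):
            fused[i] += w * feat[i]
    return l2_normalize([v / total for v in fused])


def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))
```

Under these assumptions, giving a noisy frame a lower weight pulls the fused vector toward the clean frames, which is one simple way to picture how fusion can reduce the influence of noisy samples.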

Key words: Deep learning, Image set modeling, Manifold learning, Subspace learning, Video-based face recognition

CLC Number: 

  • TP391