Facial Expression Recognition Integrating 3D Facial Dynamic Information and Optical Flow Information

doi:10.11896/jsjkx.230700210

Abstract

Abstract: Facial expression recognition has achieved excellent results on static images, but when these methods are applied to videos or image sequences, their accuracy and robustness often degrade. Traditional methods usually recognize facial expressions from spatial information and optical flow information. However, this auxiliary information is entirely two-dimensional and does not account for the fact that a change in facial expression is a three-dimensional process. To fully mine the deep semantic information available for facial expression recognition, this paper proposes a fusion-based expression recognition method that combines 3D facial dynamic information with optical flow information. The method constructs a multi-stream convolutional neural network over facial depth images, optical flow images, and RGB images, and integrates information from the three modalities for facial expression recognition. The proposed method is validated on the CAER and RAVDESS datasets, and experimental results show that it outperforms current mainstream methods in facial expression recognition performance, demonstrating its effectiveness.

Key words: Facial expression recognition, Multi-stream convolutional neural network, 3D facial dynamic information, Optical flow information
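
The abstract states that the three streams (depth, optical flow, RGB) are integrated but does not say how. As a rough illustration only, the sketch below assumes score-level (late) fusion: each stream is taken to have already produced per-class logits, which are converted to probabilities and combined by a weighted average. The function names, the emotion labels, the logit values, and the equal weights are all hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw class scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(stream_logits, weights=None):
    """Weighted average of per-stream class probabilities.

    stream_logits: dict mapping a stream name to its list of class logits.
    weights: optional dict of per-stream weights; defaults to equal weights.
    This late-fusion rule is an assumption, not the paper's stated method.
    """
    names = list(stream_logits)
    if weights is None:
        weights = {name: 1.0 / len(names) for name in names}
    num_classes = len(stream_logits[names[0]])
    fused = [0.0] * num_classes
    for name in names:
        probs = softmax(stream_logits[name])
        for i, p in enumerate(probs):
            fused[i] += weights[name] * p
    return fused


def predict(stream_logits, labels, weights=None):
    """Return the label with the highest fused probability, plus the scores."""
    fused = fuse_streams(stream_logits, weights)
    best = max(range(len(fused)), key=fused.__getitem__)
    return labels[best], fused


# Hypothetical logits standing in for the outputs of three trained CNN streams.
labels = ["happy", "sad", "neutral"]
example_logits = {
    "rgb": [2.0, 0.5, 0.1],
    "optical_flow": [1.0, 1.2, 0.3],
    "depth": [1.5, 0.2, 0.4],
}
label, fused = predict(example_logits, labels)
print(label, [round(p, 3) for p in fused])
```

In this made-up example the optical flow stream alone would favour "sad", while the RGB and depth streams outweigh it after averaging. Feature-level fusion (concatenating intermediate feature maps before a shared classifier) is an equally plausible reading of the abstract.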

CLC Number: TP391.41

ZHANG Huazhong, PAN Yuekai, TU Xiaoguang, LIU Jianhua, XU Luopeng, ZHOU Chao. Facial Expression Recognition Integrating 3D Facial Dynamic Information and Optical Flow Information[J]. Computer Science, 2024, 51(6A): 230700210-7.

