Computer Science ›› 2025, Vol. 52 ›› Issue (6A): 240900046-8.doi: 10.11896/jsjkx.240900046

• Artificial Intelligence •

FB-TimesNet:An Improved Multimodal Emotion Recognition Method Based on TimesNet

LI Weirong, YIN Jibin   

  1. Faculty of Information Engineering and Automation,Kunming University of Science and Technology,Kunming 650031,China
  • Online:2025-06-16 Published:2025-06-12
  • About author:LI Weirong,born in 2000,postgraduate,is a member of CCF(No.V2902G).His main research interests include human-computer interaction,emotion recognition and deep learning.
    YIN Jibin,born in 1976,Ph.D,associate professor.His main research interests include human-computer interaction and artificial intelligence.
  • Supported by:
    National Natural Science Foundation of China(61741206).

Abstract: Aiming at limitations in the field of emotion recognition such as single-modality information sources,poor anti-interference capability,high computational cost,and insufficient attention to temporal features,this paper proposes FB-TimesNet,a hybrid facial-expression and body-posture emotion recognition method based on an improved TimesNet.First,human body key-point coordinates are extracted from the video frames;the change values of the facial key-point coordinates relative to the natural state and the body-posture key-point coordinates serve as the raw information features of facial expression and body posture respectively,which reduces data dimensionality and computational cost.Second,the fast Fourier transform is used to capture the periodic changes of the input data,transforming the one-dimensional data into two dimensions,and two-dimensional convolution kernels then encode and extract spatio-temporal features from the two feature sets separately to enhance the representation ability of the data.Finally,a fusion algorithm dynamically allocates the weight of each modality to obtain the best fusion effect.Extensive comparative experiments on two common emotion datasets show that FB-TimesNet improves classification accuracy by 4.89% over the baseline model on the BRED dataset.
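
For a concrete picture of the two mechanisms the abstract describes, the sketch below shows (a) TimesNet-style period detection with 1D-to-2D folding (cf. ref.[24]) and (b) a dynamically weighted late fusion of the facial and body modalities. It is a minimal PyTorch sketch under stated assumptions: the function names, tensor shapes, and the softmax gate are illustrative choices, not the authors' released FB-TimesNet code.

```python
# Minimal sketch of the abstract's two core steps. All names and shapes are
# illustrative assumptions, not the authors' FB-TimesNet implementation.
import torch
import torch.fft
import torch.nn.functional as F


def dominant_periods(x: torch.Tensor, k: int = 2):
    """Pick the k strongest periods of a batch of sequences via the FFT.

    x: [batch, time, channels] keypoint features (e.g., facial deltas).
    """
    spectrum = torch.fft.rfft(x, dim=1)
    amplitude = spectrum.abs().mean(dim=(0, 2))   # average energy per frequency
    amplitude[0] = 0.0                            # ignore the DC component
    top_amp, top_freq = torch.topk(amplitude, k)
    periods = x.shape[1] // top_freq              # frequency index -> period length
    return periods, top_amp


def fold_to_2d(x: torch.Tensor, period: int) -> torch.Tensor:
    """Zero-pad the time axis to a multiple of `period` and reshape the 1D
    sequence into a 2D map, so 2D convolution kernels can model intra-period
    and inter-period variation jointly (the core TimesNet idea)."""
    b, t, c = x.shape
    rows = (t + period - 1) // period
    x = F.pad(x, (0, 0, 0, rows * period - t))    # pad the time dimension
    return x.permute(0, 2, 1).reshape(b, c, rows, period)


def fuse_logits(face_logits, body_logits, gate):
    """Dynamic late fusion: a gate vector is normalised into per-sample
    modality weights. This softmax gate is a stand-in; the abstract does not
    specify the exact form of the paper's fusion algorithm."""
    w = torch.softmax(gate, dim=-1)               # [batch, 2] modality weights
    return w[:, :1] * face_logits + w[:, 1:] * body_logits


# Toy usage: 64-frame clips of 68 facial keypoints (x, y deltas) per frame.
clip = torch.randn(8, 64, 136)
periods, _ = dominant_periods(clip, k=2)
maps = [fold_to_2d(clip, int(p)) for p in periods]   # inputs for 2D conv blocks
```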

Key words: Video emotion recognition, Spatio-temporal features, Expression recognition, Body posture, Multimodal feature fusion

CLC Number: TP242

[1]TAO J,FAN C,LIAN Z,et al.Development of multimodal sentiment recognition and understanding[J].Journal of Image and Graphics,2024,29(6):1607-1627.
[2]DONG H,NIU Y,SUN Y,et al.Speech Emotion Recognition Based on Memory Capsules and Attention[J].Computer Engineering,2025,51(4):169-177.
[3]LEONG S C,TANG Y M,LAI C H,et al.Facial expression and body gesture emotion recognition:a systematic review on the use of visual data in affective computing[J].Computer Science Review,2023,48(5):100545.
[4]KUPPENS P,VERDUYN P.Looking at emotion regulation through the window of emotion dynamics[J].Psychological Inquiry,2015,26(1):72-79.
[5]HUANG K,LI J,CHENG S,et al.An efficient algorithm of facial expression recognition by TSG-RNN network[C]//MultiMedia Modeling:26th International Conference(MMM 2020),Daejeon,South Korea,Proceedings,Part II.Springer International Publishing,2020:161-174.
[6]LAMBA P S,VIRMANI D.CNN-LSTM-based facial expression recognition[C]//Proceedings of 3rd International Conference on Computing Informatics and Networks:ICCIN 2020.Springer Singapore,2021:379-389.
[7]BHATTACHARYA U,RONCAL C,MITTAL T,et al.Take an emotion walk:Perceiving emotions from gaits using hierarchical attention pooling and affective mapping[C]//European Conference on Computer Vision.Cham:Springer,2020:145-163.
[8]ZHAO S,JIA G,YANG J,et al.Emotion recognition from multiple modalities:Fundamentals and methodologies[J].IEEE Signal Processing Magazine,2021,38(6):59-73.
[9]LIU M,LIU H,CHEN C.Enhanced skeleton visualization for view invariant human action recognition[J].Pattern Recognition,2017,68:346-362.
[10]CANAL F Z,MÜLLER T R,MATIAS J C,et al.A survey on facial emotion recognition techniques:A state-of-the-art literature review[J].Information Sciences,2022,582:593-617.
[11]EKMAN P,FRIESEN W.Facial action coding system(FACS):a technique for the measurement of facial action[M].Palo Alto,CA:Consulting Psychologists Press,1978.
[12]BORGWARDT K M,GRETTON A,RASCH M J,et al.Integrating structured biological data by kernel maximum mean discrepancy[J].Bioinformatics,2006,22(14):e49-e57.
[13]AGHABEIGI F,NAZARI S,OSATI ERAGHI N.An optimized facial emotion recognition architecture based on a deep convolutional neural network and genetic algorithm[J].Signal,Image and Video Processing,2024,18(2):1119-1129.
[14]FLISS I,ZEMZEM W.A novel PSO-ViT approach for facial emotion recognition[J].Computer Methods in Biomechanics and Biomedical Engineering:Imaging & Visualization,2024,11(7):2297016.
[15]WU Y,LI J.Multi-modal emotion identification fusing facial expression and EEG[J].Multimedia Tools and Applications,2023,82(7):10901-10919.
[16]GAVRILESCU M.Proposed architecture of fully integrated modular neural network-based automatic facial emotion recognition system based on Facial Action Coding System[C]//2014 10th International Conference on Communications(COMM).IEEE,2014.
[17]TIAN J,SHE Y.A visual-audio-based emotion recognition system integrating dimensional analysis[J].IEEE Transactions on Computational Social Systems,2022,10(6):3273-3282.
[18]WANG G,WANG Z,YANG G,et al.Survey of Artificial Emotion[J].Application Research of Computers,2006(11):7-11.
[19]SHEN Z,CHENG J,HU X,et al.Emotion recognition based on multi-view body gestures[C]//2019 IEEE International Conference on Image Processing(ICIP).IEEE,2019:3317-3321.
[20]FERDOUS A,BARI A H,GAVRILOVA M L.Emotion recognition from body movement[J].IEEE Access,2019,8:11761-11781.
[21]ZHOU T,GAO S,MEI Y,et al.Facial expressions and body postures emotion recognition based on convolutional attention network[C]//2021 International Conference on Computer,Information and Telecommunication Systems(CITS).IEEE,2021:1-5.
[22]WEI J,HU G,YANG X,et al.Learning facial expression and body gesture visual information for video emotion recognition[J].Expert Systems with Applications,2024,237:121419.
[23]MARTINEZ G H.Openpose:Whole-body pose estimation[D].Carnegie Mellon University,2019.
[24]WU H,HU T,LIU Y,et al.TimesNet:Temporal 2D-variation modeling for general time series analysis[J].arXiv:2210.02186,2022.
[25]LUCEY P,COHN J F,KANADE T,et al.The extended Cohn-Kanade dataset(CK+):A complete dataset for action unit and emotion-specified expression[C]//2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition-Workshops.2010:94-101.
[26]FILNTISIS P P,EFTHYMIOU N,KOUTRAS P,et al.Fusing body posture with facial expressions for joint recognition of affect in child-robot interaction[J].IEEE Robotics and Automation Letters,2019,4(4):4011-4018.
[27]ZHI J,SONG T,YU K,et al.Multi-attention module for dynamic facial emotion recognition[J].Information,2022,13(5):207.
[28]NEWELL A,YANG K,DENG J.Stacked hourglass networks for human pose estimation[C]//European Conference on Computer Vision.Springer,2016:483-499.