Lightweight Micro-expression Recognition Architecture Based on Bottleneck Transformer

ZHANG Jia-hao1, LIU Feng2,3,4, QI Jia-yin4   

    1 School of Computer Science and Technology, East China Normal University, Shanghai 200062, China
    2 Shanghai Institute of Intelligent Education, East China Normal University, Shanghai 200062, China
    3 Shanghai Key Laboratory of Mental Health and Psychological Crisis Intervention, Other Institutes, School of Psychology and Cognitive Science, East China Normal University, Shanghai 200062, China
    4 Institute of Artificial Intelligence and Change Management, Shanghai University of International Business and Economics, Shanghai 201620, China
  • Online:2022-06-10 Published:2022-06-08
  • About author: ZHANG Jia-hao, born in 2000, undergraduate, is a student member of the China Computer Federation. His main research interests include affective computing, computer vision and deep learning.
    LIU Feng, born in 1988, Ph.D. candidate, engineer, is a senior member of the China Computer Federation. His main research interests include deep learning, cognitive science and blockchain technology.
  • Supported by:
    Digital Transformation in China and Germany: Strategies, Structures and Solutions for Ageing Societies (GZ1570), Research Project of Shanghai Science and Technology Commission (20dz2260300) and Fundamental Research Funds for the Central Universities.

Abstract: Micro-expressions are spontaneous facial movements at a marginal spatiotemporal scale that reveal a person's true feelings. They are short in duration and slight in amplitude, which makes them difficult to recognize, but they have important research value. To address the micro-expression recognition problem, a novel, extremely lightweight neural architecture is proposed. The network takes apex-onset optical-flow features as input and integrates approaches from residual convolutional networks and visual Transformers to solve the micro-expression sentiment classification problem effectively. The architecture contains novel parameter-saving residual blocks and a bottleneck Transformer block, which replaces the convolution operators in residual blocks with a self-attention mechanism. The model is evaluated with a leave-one-subject-out (LOSO) cross-validation strategy on a combined database consisting of the 3 CASME datasets. With far fewer total parameters (39,685), the model achieves an average recall of 73.09% and an average F1-score of 72.25%, exceeding those of mainstream architectures in this domain. A series of ablation experiments is also conducted to verify the benefit of the optical strain strength, the self-attention mechanism and the relative position encoding.
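
The evaluation protocol named in the abstract can be illustrated with a minimal sketch. It assumes that "average recall" and "average F1-score" mean unweighted (macro) averages over classes and that predictions from all LOSO folds are pooled before scoring; the abstract does not state either detail. The names `loso_splits`, `macro_recall_f1`, `evaluate_loso` and the `fit_predict` callback are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def loso_splits(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject."""
    by_subject = defaultdict(list)
    for index, subject in enumerate(subject_ids):
        by_subject[subject].append(index)
    for subject in sorted(by_subject):
        test = by_subject[subject]
        held_out = set(test)
        train = [i for i in range(len(subject_ids)) if i not in held_out]
        yield subject, train, test


def macro_recall_f1(y_true, y_pred):
    """Return (mean per-class recall, mean per-class F1) over the classes in y_true."""
    classes = sorted(set(y_true))
    recalls, f1_scores = [], []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        # tp + fn > 0 because every class c occurs in y_true.
        recalls.append(tp / (tp + fn))
        f1_scores.append(2 * tp / (2 * tp + fp + fn))
    return sum(recalls) / len(classes), sum(f1_scores) / len(classes)


def evaluate_loso(subject_ids, labels, fit_predict):
    """Pool predictions over all folds (an assumption), then score once.

    fit_predict(train_indices, test_indices) is a hypothetical callback that
    trains a model on the training samples and returns one predicted label
    per test sample.
    """
    y_true, y_pred = [], []
    for _, train, test in loso_splits(subject_ids):
        y_true.extend(labels[i] for i in test)
        y_pred.extend(fit_predict(train, test))
    return macro_recall_f1(y_true, y_pred)
```

Splitting by subject rather than by sample keeps every sample of the held-out person out of training, so the scores reflect generalization to unseen subjects. Averaging per class rather than per sample keeps rare emotion classes from being swamped by frequent ones.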

Key words: Computational affection, Micro-expression recognition, Residual convolutional neural network, Self-attention mechanism, Visual Transformer

CLC Number: 

  • TP301.6
