Computer Science ›› 2026, Vol. 53 ›› Issue (1): 195-205. doi: 10.11896/jsjkx.250900051

• Computer Graphics & Multimedia •

Facial Expression Recognition with Channel Attention Guided Global-Local Semantic Cooperation

LYU Jinggang, GAO Shuo, LI Yuzhi, ZHOU Jin   

  1. School of Science and Technology, Tianjin University of Finance and Economics, Tianjin 300222, China
  • Received: 2025-09-07  Revised: 2025-12-11  Online: 2026-01-08
  • Corresponding author: ZHOU Jin (zhoujin@tjufe.edu.cn)
  • About author: LYU Jinggang (jingganglv@163.com), born in 1977, associate professor, Ph.D., master's supervisor. His main research interests include speech signal processing and speech enhancement.
    ZHOU Jin, born in 1981, Ph.D., associate professor, master's supervisor. Her main research interests include modulation recognition and spectrum sensing based on deep learning.
  • Supported by:
    Tianjin Natural Science Foundation (22JCYBJC01550), Tianjin Science and Technology Development Strategy Research Program (25ZLRKZL00150), Scientific Research Project of the Tianjin Municipal Education Commission (2023SK105, CJRHZD2308) and General Project of Humanities and Social Sciences of the Tianjin Municipal Education Commission (2024SK103).

Abstract: In facial expression recognition, noisy data caused by poor image quality often degrades recognition accuracy, while limited sample sizes make it difficult for conventional deep learning models to distinguish noisy from clean expression features. To address these challenges, this paper proposes a novel framework, CAFSC, which integrates a channel attention strategy with adaptive grouped reordering and a global-local collaborative mechanism to enhance recognition performance. A noise-robust data augmentation strategy is first introduced, combining random Gaussian blur, perspective transformation, and color perturbation with image stitching, random flipping, and rotation; this preserves subtle expression cues while improving image clarity, dataset diversity, and robustness to subtle emotions. A Channel Attention Module with Adaptive Channel Reordering (CAM-ACR) is then designed: it reorders channel features according to their attention scores and applies grouped convolution and concatenation to obtain local features carrying multi-dimensional semantic information. Next, a local-global feature enhancement mechanism uses local features to guide global feature extraction, strengthening the representation of complex emotional patterns and contextual information. Finally, an improved cross-attention fusion module achieves bidirectional guidance and collaborative enhancement between global and local features. Experimental results show that CAFSC achieves accuracies of 91.21% on RAF-DB, 98.31% on CK+, 74.54% on FER2013, and 86.74% on FER2013PLUS, and it also exhibits superior learning efficiency and convergence stability on RAF-DB compared with existing methods.
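This page does not include the authors' source code. As a reading aid only, the following is a minimal sketch of a noise-robust augmentation pipeline of the kind the abstract describes (random Gaussian blur, perspective transformation, color perturbation, random flipping, and rotation), built from standard torchvision transforms. All parameter values are illustrative assumptions, and the image-stitching step mentioned in the abstract is not reproduced here.

from torchvision import transforms

# Illustrative settings only; the paper's actual configuration is not given on this page.
noise_robust_augment = transforms.Compose([
    transforms.Resize((224, 224)),
    # Random Gaussian blur applied to half of the samples.
    transforms.RandomApply([transforms.GaussianBlur(kernel_size=3, sigma=(0.1, 2.0))], p=0.5),
    # Perspective transformation and color perturbation.
    transforms.RandomPerspective(distortion_scale=0.2, p=0.5),
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.05),
    # Random flipping and rotation.
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomRotation(degrees=15),
    transforms.ToTensor(),
])

# Usage: augmented = noise_robust_augment(pil_image)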

Key words: Facial expression recognition, Local features, Global features, Attention mechanism, Noise robustness
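For the CAM-ACR module named in the abstract, the sketch below shows one plausible PyTorch realization of a channel attention module with adaptive channel reordering followed by grouped convolution. The class name, the SE-style scoring branch, the use of torch.argsort for reordering, and all hyperparameters are assumptions based only on the abstract, not the authors' implementation.

import torch
import torch.nn as nn

class CAMACR(nn.Module):
    """Sketch of a Channel Attention Module with Adaptive Channel Reordering.

    Channels are scored by an SE-style attention branch, reordered by score,
    split into groups, and processed by a grouped convolution whose group
    outputs are concatenated back along the channel axis.
    """

    def __init__(self, channels: int, num_groups: int = 4, reduction: int = 16):
        super().__init__()
        assert channels % num_groups == 0
        # SE-style channel attention: global pooling + bottleneck MLP.
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )
        # Grouped convolution applied after the channels are reordered.
        self.group_conv = nn.Conv2d(channels, channels, kernel_size=3,
                                    padding=1, groups=num_groups)
        self.bn = nn.BatchNorm2d(channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # 1) Channel attention scores in [0, 1], used to recalibrate channels.
        scores = self.fc(self.pool(x).view(b, c))              # (B, C)
        x = x * scores.view(b, c, 1, 1)
        # 2) Adaptive reordering: sort channels by attention score so each
        #    convolution group sees channels of similar importance.
        order = torch.argsort(scores, dim=1, descending=True)  # (B, C)
        idx = order.view(b, c, 1, 1).expand(-1, -1, h, w)
        x = torch.gather(x, dim=1, index=idx)
        # 3) Grouped convolution over contiguous blocks of the reordered
        #    channels; the group outputs are concatenated along the channel axis.
        return self.act(self.bn(self.group_conv(x)))

if __name__ == "__main__":
    feat = torch.randn(2, 64, 28, 28)
    print(CAMACR(64)(feat).shape)   # torch.Size([2, 64, 28, 28])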

CLC Number: 

  • TP391.41
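Finally, the abstract describes an improved cross-attention module that fuses global and local features with bidirectional guidance. The compact sketch below illustrates one way such a bidirectional cross-attention fusion could be written; the class name, the (B, N, D) token layout with equal token counts for both streams, and the residual-plus-concatenation fusion are assumptions rather than the authors' design.

import torch
import torch.nn as nn

class BidirectionalCrossAttentionFusion(nn.Module):
    """Global tokens attend to local tokens and vice versa; the two enhanced
    streams are then concatenated per token and projected to a fused embedding."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.local_to_global = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.global_to_local = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_g = nn.LayerNorm(dim)
        self.norm_l = nn.LayerNorm(dim)
        self.proj = nn.Linear(2 * dim, dim)

    def forward(self, global_feat: torch.Tensor, local_feat: torch.Tensor) -> torch.Tensor:
        # global_feat, local_feat: (B, N, D) token sequences with the same N.
        # Local features guide the global stream ...
        g_enh, _ = self.local_to_global(global_feat, local_feat, local_feat)
        g_enh = self.norm_g(global_feat + g_enh)
        # ... and global features guide the local stream.
        l_enh, _ = self.global_to_local(local_feat, global_feat, global_feat)
        l_enh = self.norm_l(local_feat + l_enh)
        # Concatenate the two enhanced streams per token and project back.
        return self.proj(torch.cat([g_enh, l_enh], dim=-1))

if __name__ == "__main__":
    g = torch.randn(2, 49, 256)   # e.g. 7x7 global tokens
    l = torch.randn(2, 49, 256)   # matching local tokens
    print(BidirectionalCrossAttentionFusion(256)(g, l).shape)  # torch.Size([2, 49, 256])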