Computer Science (计算机科学), 2026, Vol. 53, Issue (5): 99-108. doi: 10.11896/jsjkx.250600162
王丽燕1, 张倩2, 郭圆圆2, 陈海丰2, 李健2
WANG Liyan1, ZHANG Qian2, GUO Yuanyuan2, CHEN Haifeng2, LI Jian2
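Abstract: Spoken English occupies an important place in English learning. To address the scarcity of datasets for evaluating emotional expression in spoken English, and the insufficient use of modality information, the authors construct a new dataset, the English Spoken Multimodal Emotion Dataset (ESMED), and annotate it with continuous emotion labels (arousal and valence) and emotion quality scores. They also propose a new network model for evaluating emotion in spoken English. The model first compresses and fuses continuous emotion information through perceptual resampling and a multimodal fusion module in order to predict arousal and valence. A learnable bottleneck layer and a joint decoding layer then apply task-specific transformations to the features, and an emotion quality evaluation module jointly decodes arousal, valence, and the transformed features to produce the final quantized emotion quality score. Experiments on the ESMED dataset give a concordance correlation coefficient (CCC) of 0.5003 and a mean absolute error (MAE) of 0.6354, which supports the effectiveness and accuracy of the method.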
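The two metrics reported in the abstract can be illustrated with a minimal sketch. It assumes the standard definition of the concordance correlation coefficient (Lin's CCC, computed with population variances), since the abstract does not state the exact formula used. The function names `ccc` and `mae` and the sample values are illustrative and do not come from the paper.

```python
def _mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


def ccc(y_true, y_pred):
    """Concordance correlation coefficient (assumed: Lin's definition).

    CCC = 2 * cov(t, p) / (var(t) + var(p) + (mean(t) - mean(p))^2)
    It is 1 for perfect agreement and drops when predictions are
    shifted or rescaled, even if they remain perfectly correlated.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_t, mean_p = _mean(y_true), _mean(y_pred)
    var_t = _mean([(t - mean_t) ** 2 for t in y_true])
    var_p = _mean([(p - mean_p) ** 2 for p in y_pred])
    cov = _mean([(t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred)])
    denom = var_t + var_p + (mean_t - mean_p) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined when both inputs are the same constant")
    return 2.0 * cov / denom


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return _mean([abs(t - p) for t, p in zip(y_true, y_pred)])


if __name__ == "__main__":
    # Illustrative values only: predictions offset by a constant 1.0.
    labels = [1.0, 2.0, 3.0]
    predictions = [2.0, 3.0, 4.0]
    print("CCC:", ccc(labels, predictions))
    print("MAE:", mae(labels, predictions))
```

In this example the predictions are perfectly correlated with the labels but shifted by a constant, so a plain correlation coefficient would be 1 while CCC is penalized by the mean offset. This is one reason CCC is commonly paired with an error measure such as MAE when scoring continuous emotion predictions.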