Computer Science (计算机科学), 2026, Vol. 53, Issue (5): 257-267. doi: 10.11896/jsjkx.260300053
XU Weihua (徐伟华), HU Kaiping (胡开平)
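Abstract: Speech emotion recognition (SER) plays an important role in human-computer interaction systems. To address the opaque decision process of existing deep learning models on SER tasks, and the concept drift that traditional concept-cognitive learning (CCL) suffers when noise corrupts incremental data, a three-way fuzzy concept-cognitive classification framework incorporating an extremely randomized trees weighting mechanism (3WERT-WFCCL) is constructed. For feature processing, the model uses Whisper to extract high-dimensional speech features, which a multilayer perceptron then abstracts into hierarchical representations. In the cognitive learning stage, the extremely randomized trees algorithm is introduced to compute feature importance, so that attribute weights are quantified and assigned automatically, and the fault-tolerance threshold parameters of three-way decision are embedded in the cognitive operators to build a positive-negative bidirectional cognitive mechanism. When incremental data arrive, the model partitions new samples into the positive region, boundary region, and negative region according to feature discernibility distance, and adopts a robust strategy that updates concepts using only positive-region samples, which effectively resists noise interference. On the SAVEE dataset, whose feature boundaries are relatively complex, the robust update strategy improves accuracy by 0.16 percentage points over the global update strategy. Experiments on two public datasets, EmoDB and SAVEE, show that 3WERT-WFCCL outperforms existing baseline methods on several key evaluation metrics. Compared with logistic regression (LR), the best-performing baseline on each dataset, the proposed method improves accuracy by 1.53 and 0.62 percentage points and the F1 score by 1.28 and 0.40 percentage points, respectively. The experimental results verify the effectiveness of introducing the three-way decision mechanism and offer a new approach to building SER models that combine high classification accuracy, strong noise robustness, and logical interpretability.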
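The positive-region-only update described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Concept` class, the weighted Euclidean distance, the running-mean update, and the thresholds `alpha` and `beta` are all hypothetical stand-ins chosen for illustration. The attribute weights are assumed to come from some feature-importance estimate (the abstract uses extremely randomized trees) and are simply passed in here.

```python
from __future__ import annotations

from dataclasses import dataclass
from math import sqrt


@dataclass
class Concept:
    """Hypothetical stand-in for a learned concept: a label plus a running-mean prototype."""
    label: str
    center: list[float]
    count: int


def weighted_distance(x: list[float], center: list[float], weights: list[float]) -> float:
    """Weighted Euclidean distance; an assumed proxy for the paper's discernibility distance."""
    return sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, center)))


def three_way_region(dist: float, alpha: float, beta: float) -> str:
    """Map a distance to a region: POS if dist <= alpha, NEG if dist >= beta, else BND."""
    if not alpha < beta:
        raise ValueError("alpha must be smaller than beta")
    if dist <= alpha:
        return "POS"
    if dist >= beta:
        return "NEG"
    return "BND"


def robust_update(concept: Concept, samples: list[list[float]],
                  weights: list[float], alpha: float, beta: float) -> list[str]:
    """Assign each incoming sample to a region; only POS samples move the prototype."""
    regions = []
    for x in samples:
        region = three_way_region(weighted_distance(x, concept.center, weights), alpha, beta)
        regions.append(region)
        if region == "POS":
            n = concept.count
            # Incremental mean: boundary and negative samples are deliberately ignored.
            concept.center = [(c * n + a) / (n + 1) for c, a in zip(concept.center, x)]
            concept.count = n + 1
    return regions


# Illustrative values only; weights would normally be normalized feature importances.
demo = Concept(label="happy", center=[0.0, 0.0], count=4)
print(robust_update(demo, [[0.2, 0.0], [1.0, 1.0], [0.6, 0.6]], [0.5, 0.5], alpha=0.3, beta=0.8))
print(demo)
```

A global update strategy would, by contrast, fold every incoming sample into the prototype regardless of region, which is how noisy samples can drag a concept away from its class.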