Computer Science ›› 2022, Vol. 49 ›› Issue (7): 132-141. doi: 10.11896/jsjkx.210100085
• Computer Graphics & Multimedia •
XU Ming-ke, ZHANG Fan