Construction and Application of Russian Multimodal Emotion Corpus

Computer Science, 2021, Vol. 48, Issue (11): 312-318. doi: 10.11896/jsjkx.200900088

XU Lin-hong (徐琳宏)¹, LIU Xin (刘鑫)¹, YUAN Wei (原伟)², QI Rui-hua (祁瑞华)¹

1 Research Center for Language Intelligence, Dalian University of Foreign Languages, Dalian, Liaoning 116044, China
2 Information Engineering University, Luoyang, Henan 471003, China

Received: 2020-09-10 Revised: 2021-03-26 Publication date: 2021-11-15 Release date: 2021-11-10
Corresponding author: XU Lin-hong (qingniao1203@163.com)
About author: XU Lin-hong, born in 1979, associate professor. Her main research interests include natural language processing and sentiment analysis.
Supported by:
Youth Fund Project of Humanities and Social Sciences, Ministry of Education (18YJCZH208) and National Natural Science Foundation of China (61806038, 61772103).

Abstract

Abstract: Multimodal sentiment analysis for Russian is a research hotspot in the field of sentiment analysis. It automatically analyzes and recognizes emotions from rich information such as text, speech and images, which helps to track public opinion hotspots in Russian-speaking countries and regions in a timely manner. However, few Russian multimodal emotion corpora exist at present, which restricts the further development of Russian sentiment analysis technology. To address this problem, and building on an analysis of related work on multimodal emotion corpora and of emotion classification methods, this paper first formulates a systematic and complete annotation scheme that covers 11 items of information in three parts: utterance, space-time and emotion. Then, throughout corpus construction and quality control, it follows the principle of the emotional subject and the principle of emotional continuity, draws up annotation guidelines that are practical to apply, and builds a relatively large-scale Russian multimodal emotion corpus. Finally, it discusses applications of the corpus in analyzing the characteristics of emotional expression, analyzing personality traits, and constructing emotion recognition models.

Key words: Corpus, Multimodal, Russian, Sentiment analysis

CLC Number: TP393

XU Lin-hong, LIU Xin, YUAN Wei, QI Rui-hua. Construction and Application of Russian Multimodal Emotion Corpus[J]. Computer Science, 2021, 48(11): 312-318. https://doi.org/10.11896/jsjkx.200900088

