Computer Science (计算机科学), 2022, Vol. 49, Issue (8): 279-293. doi: 10.11896/jsjkx.220300099
WANG Jian (王剑)1, PENG Yu-qi (彭雨琦)2, ZHAO Yu-fei (赵宇斐)2, YANG Jian (杨健)3
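Abstract: With the rapid development of social media platforms, public opinion information can spread widely within a very short time. Left unmanaged and uncontrolled, it can pose a serious threat to the online environment and even to society at large. Because information extraction yields semantic, precise output, it is the first and most critical step in public opinion analysis and management. In recent years, deep learning's ability to automatically learn latent features and feature combinations has substantially improved accuracy on every information extraction subtask. Drawing on the characteristics of social network public opinion and on applications of deep learning in information extraction, this paper systematically reviews and summarizes deep learning-based methods for extracting public opinion information from social networks. It first describes how social network public opinion information is organized and details the framework and evaluation metrics for public opinion information extraction. It then comprehensively reviews and analyzes existing deep learning-based extraction models and discusses the applicability and limitations of current methods. Finally, it outlines future research trends.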