Computer Science ›› 2022, Vol. 49 ›› Issue (12): 264-273. doi: 10.11896/jsjkx.211100226
朱艺娜, 曹阳, 钟靖越, 郑泳智
ZHU Yi-na, CAO Yang, ZHONG Jing-yue, ZHENG Yong-zhi
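Abstract: Event extraction studies how to extract event information of interest to users from unstructured natural-language text. It is an important branch of information extraction and in recent years has been widely applied in intelligence analysis, intelligent question answering, information retrieval, and recommender systems. Starting from the concepts and tasks of event extraction, this paper comprehensively surveys its datasets and methods, analyzes the technical progress on the event extraction task, and summarizes event extraction methods based on pattern matching, machine learning, and deep learning. It focuses on deep-learning-based methods, organized by how the model is learned and by the scope of the features it uses, and discusses the strengths and weaknesses of the different methods. Finally, it summarizes the challenges facing current research and future research trends, in particular the trends that address low-resource scenarios, poor model portability, and the difficulty of modeling document-level event extraction.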