Computer Science, 2023, Vol. 50, Issue (5): 255-261. doi: 10.11896/jsjkx.220300154

• Artificial Intelligence •

Document-level Event Extraction Based on Multi-granularity Entity Heterogeneous Graph

ZHANG Hu, ZHANG Guangjun

  1. School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China
  • Received: 2022-03-16 Revised: 2022-10-15 Online: 2023-05-15 Published: 2023-05-06
  • Corresponding author: ZHANG Hu (zhanghu@sxu.edu.cn)
  • About author: ZHANG Hu, born in 1979, Ph.D., professor, is a member of China Computer Federation. His main research interests include natural language processing and representation learning.
  • Supported by:
    National Natural Science Foundation of China (62176145) and National Key Research and Development Program of China (2020AAA0106100).

Abstract: Document-level event extraction is the task of extracting events from long texts that span multiple sentences. Existing studies generally divide it into three subtasks (candidate entity extraction, event detection, and argument recognition) and usually train them jointly. However, most existing methods extract candidate entities sentence by sentence and ignore cross-sentence context, which reduces the accuracy of entity extraction and argument recognition and, in turn, degrades the final event extraction results. To address this, this paper proposes a document-level event extraction method based on a multi-granularity entity heterogeneous graph. The method uses two independent encoders, Transformer and RoBERTa, for sentence-level and paragraph-level entity extraction, respectively. It also introduces a multi-granularity entity selection strategy that selects the entities more likely to be event arguments from the sentence-level and paragraph-level entity sets, and then constructs a heterogeneous graph that incorporates these multi-granularity entities. Finally, a graph convolutional network produces document-aware entity and sentence representations, which are used for multi-label classification of event types and event arguments, thereby performing event detection and argument recognition. Experiments on the ChFinAnn and DuEE-fin datasets show that the proposed method improves the F1 score over previous methods by about 1.3% and 3.9%, respectively, which demonstrates its effectiveness.

Key words: Document-level event extraction, Event extraction, Heterogeneous graph, Entity extraction, Multi-granularity

CLC Number:

  • TP391
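
The graph step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the edge types (entity-sentence edges plus edges between entities in the same sentence), the symmetric normalization, the function names `build_graph` and `gcn_layer`, and the toy features are all assumptions chosen for illustration. The sketch builds a small graph with sentence nodes and entity nodes, then runs one standard graph convolution step so that each node representation mixes in information from its neighbors.

```python
import math

def build_graph(num_sentences, entity_sentences):
    """Build a dense adjacency matrix with self-loops (assumed edge scheme).

    entity_sentences[i] is the index of the sentence that contains entity i.
    Node order: sentence nodes first, then entity nodes.
    """
    n = num_sentences + len(entity_sentences)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        adj[i][i] = 1.0  # self-loop keeps each node's own features
    for e, s in enumerate(entity_sentences):
        u = num_sentences + e
        adj[u][s] = adj[s][u] = 1.0  # entity-sentence edge
        for e2, s2 in enumerate(entity_sentences):
            if e2 != e and s2 == s:
                adj[u][num_sentences + e2] = 1.0  # same-sentence entity edge
    return adj

def gcn_layer(adj, feats, weight):
    """One propagation step: ReLU(D^-1/2 A D^-1/2 X W)."""
    n, in_dim, out_dim = len(adj), len(weight), len(weight[0])
    deg = [sum(row) for row in adj]
    # Aggregate normalized neighbor features.
    agg = [[0.0] * in_dim for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if adj[i][j]:
                coef = adj[i][j] / math.sqrt(deg[i] * deg[j])
                for k in range(in_dim):
                    agg[i][k] += coef * feats[j][k]
    # Apply the linear transform followed by ReLU.
    return [
        [max(0.0, sum(agg[i][k] * weight[k][o] for k in range(in_dim)))
         for o in range(out_dim)]
        for i in range(n)
    ]

# Toy document: 2 sentences; entities 0 and 1 occur in sentence 0, entity 2 in sentence 1.
adj = build_graph(2, [0, 0, 1])
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5], [0.0, 2.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity weights, for readability only
hidden = gcn_layer(adj, feats, weight)
```

In a full system, the resulting entity and sentence representations would feed the multi-label classifiers for event types and arguments. The weights would be learned rather than fixed, and the node features would come from the encoders.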