Computer Science ›› 2023, Vol. 50 ›› Issue (5): 255-261. doi: 10.11896/jsjkx.220300154

• Artificial Intelligence •

Document-level Event Extraction Based on Multi-granularity Entity Heterogeneous Graph

ZHANG Hu, ZHANG Guangjun

  1. School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China
  • Received: 2022-03-16 Revised: 2022-10-15 Online: 2023-05-15 Published: 2023-05-06
  • Corresponding author: ZHANG Hu (zhanghu@sxu.edu.cn)
  • About author: ZHANG Hu, born in 1979, Ph.D., professor, is a member of China Computer Federation. His main research interests include natural language processing and representation learning.
  • Supported by:
    National Natural Science Foundation of China (62176145) and National Key Research and Development Program of China (2020AAA0106100).

Abstract: Document-level event extraction is the task of extracting events from long texts that span multiple sentences. Existing studies generally divide it into three sub-tasks (candidate entity extraction, event detection, and argument recognition) and usually train them jointly. However, most existing methods extract candidate entities sentence by sentence without considering cross-sentence context, which reduces the accuracy of entity extraction and argument recognition and in turn degrades the final event extraction results. To address this, this paper proposes a document-level event extraction method based on a multi-granularity entity heterogeneous graph. The method uses two independent encoders, Transformer and RoBERTa, for sentence-level and paragraph-level entity extraction, respectively. It also proposes a multi-granularity entity selection strategy that picks, from the sentence-level and paragraph-level entity sets, the entities more likely to be event arguments, and then constructs a heterogeneous graph incorporating these multi-granularity entities. Finally, a graph convolutional network produces document-aware entity and sentence representations, which are used for multi-label classification of event types and event arguments to perform event detection and argument recognition. Experiments on the ChFinAnn and DuEE-fin datasets show that the proposed method improves F1 by about 1.3% and 3.9%, respectively, over previous methods, which demonstrates its effectiveness.

Key words: Document-level event extraction, Event extraction, Heterogeneous graph, Entity extraction, Multi-granularity
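
The pipeline summarized in the abstract (select entities from two granularities, build a heterogeneous graph of sentence and entity nodes, then propagate information over it) can be illustrated with a minimal pure-Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the confidence scores, the threshold-based selection rule, the three edge types, and the parameter-free neighbour averaging that stands in for a trained graph convolutional layer. The function names `select_entities`, `build_graph`, and `propagate` are hypothetical.

```python
from collections import defaultdict


def select_entities(sentence_entities, paragraph_entities, threshold=0.5):
    """Merge two candidate sets of (sentence_index, text, score) mentions.

    Assumed rule: keep a mention if its best score across both sets
    reaches the threshold. The paper's actual strategy may differ.
    """
    best = {}
    for sent_idx, text, score in sentence_entities + paragraph_entities:
        key = (sent_idx, text)
        if score >= threshold and score > best.get(key, 0.0):
            best[key] = score
    return sorted(best)


def build_graph(num_sentences, mentions):
    """Build undirected adjacency sets over sentence and mention nodes.

    Sentence nodes are ('S', i); mention nodes are ('E', i, text).
    The three edge types below are assumptions of this sketch.
    """
    adj = defaultdict(set)

    def link(a, b):
        adj[a].add(b)
        adj[b].add(a)

    # Sentence-sentence edges: every pair of sentences is connected.
    for i in range(num_sentences):
        for j in range(i + 1, num_sentences):
            link(('S', i), ('S', j))
    # Sentence-mention edges: each mention is tied to its own sentence.
    for sent_idx, text in mentions:
        link(('E', sent_idx, text), ('S', sent_idx))
    # Mention-mention edges: same surface text in different sentences.
    for a in mentions:
        for b in mentions:
            if a < b and a[1] == b[1]:
                link(('E',) + a, ('E',) + b)
    return adj


def propagate(adj, features):
    """One propagation step: average each node's vector with its neighbours'.

    This omits the learned weights and non-linearity of a real GCN layer.
    """
    out = {}
    for node, vec in features.items():
        group = [vec] + [features[n] for n in adj[node]]
        out[node] = [sum(col) / len(group) for col in zip(*group)]
    return out
```

In this sketch a mention that the sentence-level pass scores too low can still be kept when the paragraph-level pass is confident about it, which is one plausible reading of why combining the two granularities helps. After propagation, each sentence and mention vector mixes in information from connected nodes elsewhere in the document.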

CLC Number: 

  • TP391