计算机科学 (Computer Science), 2023, Vol. 50, Issue (5): 255-261. doi: 10.11896/jsjkx.220300154
ZHANG Hu (张虎), ZHANG Guangjun (张广军)
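Abstract: Document-level event extraction is the task of extracting events from long texts that span multiple sentences. Existing work generally divides it into three subtasks (candidate entity extraction, event detection, and argument identification) and usually trains them jointly. However, most existing document-level methods extract candidate entities sentence by sentence and ignore cross-sentence context, which clearly reduces the accuracy of entity extraction and argument identification and degrades the final event extraction results. To address this, this paper proposes a document-level event extraction method based on a multi-granularity entity heterogeneous graph. The method uses two independent encoders, Transformer and RoBERTa, for sentence-level and paragraph-level entity extraction, respectively. It also proposes a multi-granularity entity selection strategy that selects, from the sentence-level and paragraph-level entity sets, the entities more likely to be arguments, and then constructs a heterogeneous graph that incorporates these multi-granularity entities. Finally, a graph convolutional network is used to obtain document-level, context-aware representations of entities and sentences, on which multi-label classification of event types and event arguments is performed to accomplish event detection and argument identification. Experiments on the ChFinAnn and DuEE-fin datasets show that the proposed method improves the F1 score over previous methods by about 1.3% and 3.9%, respectively, demonstrating its effectiveness.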
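The pipeline outlined in the abstract (select entities from two granularities, build a heterogeneous graph, propagate context over it) can be illustrated with a toy sketch. The abstract gives no implementation details, so everything here is an assumption made for illustration: the function names, the score threshold, the selection rule, the edge types, and the use of plain unweighted mean aggregation in place of a trained graph convolutional network.

```python
# Toy sketch only: all names, the threshold, the selection rule, and the edge
# types are hypothetical and are NOT taken from the paper.

def select_entities(sentence_entities, paragraph_entities, threshold=0.5):
    """Merge sentence-level and paragraph-level entity candidates.

    Each candidate is (sentence_index, span_text, score). Assumed rule: a span
    proposed by both extractors is always kept; a span proposed by only one
    is kept when its score reaches `threshold`.
    """
    sent = {(i, text): score for i, text, score in sentence_entities}
    para = {(i, text): score for i, text, score in paragraph_entities}
    selected = []
    for key in sent.keys() | para.keys():
        best = max(sent.get(key, 0.0), para.get(key, 0.0))
        if (key in sent and key in para) or best >= threshold:
            selected.append(key)
    return sorted(selected)


def build_heterogeneous_graph(num_sentences, entities):
    """Return an adjacency dict over sentence nodes and entity-mention nodes.

    Sentence nodes are ("S", i); entity nodes are ("E", i, text). Assumed
    edges: adjacent sentences, a sentence and the mentions it contains, and
    mentions that share the same text across sentences.
    """
    nodes = [("S", i) for i in range(num_sentences)]
    nodes += [("E", i, text) for i, text in entities]
    adjacency = {node: set() for node in nodes}

    def connect(u, v):
        adjacency[u].add(v)
        adjacency[v].add(u)

    for i in range(num_sentences - 1):
        connect(("S", i), ("S", i + 1))
    for i, text in entities:
        connect(("S", i), ("E", i, text))
    for a in entities:
        for b in entities:
            if a < b and a[1] == b[1]:
                connect(("E",) + a, ("E",) + b)
    return adjacency


def propagate(adjacency, features):
    """One unweighted aggregation step: mean of a node's own vector and its
    neighbours' vectors. A real GCN layer would add learned weights and a
    nonlinearity; both are omitted here."""
    updated = {}
    for node, neighbours in adjacency.items():
        group = [features[node]] + [features[n] for n in neighbours]
        dim = len(features[node])
        updated[node] = [sum(vec[d] for vec in group) / len(group) for d in range(dim)]
    return updated
```

In this sketch, a mention in one sentence ends up with a representation influenced by other sentences through the shared-text and sentence-adjacency edges, which is the cross-sentence context that sentence-by-sentence extraction lacks.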