Electra Based Chinese Event Detection Model with Dependency Syntax Tree

doi:10.11896/jsjkx.230600158

Computer Science, 2024, Vol. 51, Issue (6A): 230600158-6. doi: 10.11896/jsjkx.230600158


YIN Baosheng, KONG Weiyi

Human-Machine Intelligence Research Center, Shenyang Aerospace University, Shenyang 110136, China

Published: 2024-06-06
Corresponding author: KONG Weiyi (1120444186@qq.com)
About the authors: (54951941@qq.com)
YIN Baosheng, born in 1975, professor. His main research interests include deep learning and natural language processing.
KONG Weiyi, born in 1999, postgraduate. Her main research interests include event detection.
Supported by: Liaoning Provincial Department of Education project (LJKMZ20220536).

Abstract

Event detection is an important research direction in information extraction. Existing event detection models are constrained by the training objectives of their language models and can only acquire the dependency relations between words passively, so during training they attend too much to components unrelated to the training objective, which leads to incorrect detection results. Previous studies have shown that fully understanding contextual information is crucial for deep-learning-based event detection. Therefore, building on the Electra pre-trained model, this paper introduces a key-value memory network (KVMN) to capture the dependency relations between words and enhance their semantic features, and adopts a gating mechanism to weight these features. Then, to address erroneous decisions made by the model in Chinese event detection, negative samples are added to the input and different levels of noise are added to different samples, so that the model learns better embedding representations and generalizes better to unseen samples. Finally, experimental results on the public dataset LEVEN show that the method outperforms existing methods, achieving an F1 score of 93.43%.

Key words: event detection, dependency relations, key-value memory network, gating mechanism, negative sampling
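
The abstract describes the KVMN and the gating mechanism only at a high level. The following is a minimal sketch of how a key-value memory read followed by a gate could work for a single word; it is not the authors' implementation. The function names (kvmn_read, gated_fuse), the dot-product scoring, the element-wise sigmoid gate, and all toy vectors are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def kvmn_read(h, keys, values):
    """Read from a key-value memory for one word.

    h      : hidden vector of the current word (e.g. from the encoder)
    keys   : one vector per dependency-linked context word (assumed)
    values : one vector per dependency relation of that link (assumed)
    Returns the weighted sum of the values and the attention weights.
    """
    weights = softmax([dot(h, k) for k in keys])
    dim = len(values[0])
    out = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return out, weights


def gated_fuse(h, o, w_h, w_o, b):
    """Element-wise gate: g = sigmoid(w_h*h + w_o*o + b); result = g*h + (1-g)*o."""
    fused = []
    for hd, od, wh, wo, bd in zip(h, o, w_h, w_o, b):
        g = 1.0 / (1.0 + math.exp(-(wh * hd + wo * od + bd)))
        fused.append(g * hd + (1.0 - g) * od)
    return fused


# Toy example: one word with two dependency neighbours (all numbers hypothetical).
h = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[0.0, 2.0], [4.0, 0.0]]
memory_out, attn = kvmn_read(h, keys, values)
fused = gated_fuse(h, memory_out, [0.0, 0.0], [0.0, 0.0], [0.0, 0.0])
```

With all gate parameters set to zero the gate equals 0.5 in every dimension, so the fused vector is the plain average of the word's hidden vector and the memory output; in a trained model the gate parameters would be learned so that each dimension decides how much dependency information to admit.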

CLC number: TP391

YIN Baosheng, KONG Weiyi. Electra Based Chinese Event Detection Model with Dependency Syntax Tree[J]. Computer Science, 2024, 51(6A): 230600158-6. https://doi.org/10.11896/jsjkx.230600158

