Computer Science, 2021, Vol. 48, Issue (7): 292-298. doi: 10.11896/jsjkx.200500133

• Artificial Intelligence •

BGCN: Trigger Detection Based on BERT and Graph Convolution Network

CHENG Si-wei1, GE Wei-yi2, WANG Yu2, XU Jian1

  1. 1 School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
    2 Key Laboratory of Information System Engineering, 28th Research Institute of China Electronic Science and Technology Group Corporation, Nanjing 210007, China
  • Received: 2020-05-26 Revised: 2020-08-10 Online: 2021-07-15 Published: 2021-07-02
  • Corresponding author: XU Jian (dolphin.xu@njust.edu.cn)
  • About author: CHENG Si-wei, born in 1997, postgraduate. His main research interests include natural language processing and machine learning. (735665705@qq.com)
    XU Jian, born in 1979, Ph.D., professor, master's supervisor. His main research interests include data mining and knowledge graphs.
  • Supported by:
    National Natural Science Foundation of China (61872186) and the Open Fund of the Science and Technology on Information System Engineering Laboratory (05201901).

Abstract: Trigger word detection is a basic task in event extraction that involves identifying and classifying trigger words. Previous work has two main problems: (1) neural network models for trigger word detection consider only the sequential representation of a sentence, and sequential modeling is inefficient at capturing long-distance dependencies; (2) although representation-based methods remove the need for manual feature extraction, the word vectors used as initial training features represent the sentence inadequately, which makes it difficult to capture deep bidirectional representations. We therefore propose BGCN, a trigger word detection model based on the BERT model and a graph convolutional network (GCN). The model strengthens the feature representation by introducing BERT word vectors and introduces syntactic structure to capture long-distance dependencies when detecting event trigger words. Experimental results show that the proposed method outperforms other existing neural network models on the ACE2005 dataset.

Key words: BERT, Event trigger, Bi-LSTM, Graph convolution network, Sequence labeling
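As an illustration of how syntactic structure can link distant words, the following is a minimal sketch of a single graph-convolution step over a dependency graph. It is not the authors' implementation: the example sentence, the dependency arcs, the toy feature vectors, the mean-over-neighbors normalization, and the function names `build_adjacency` and `gcn_layer` are all assumptions made for illustration. In a model such as the one described in the abstract, the input vectors would presumably come from BERT and a Bi-LSTM rather than being set by hand.

```python
# Illustrative sketch only: one graph-convolution step over a dependency
# graph, written in pure Python. All data below is hypothetical.

def build_adjacency(num_tokens, dep_edges):
    """Build a symmetric adjacency matrix with self-loops from (head, dependent) pairs."""
    adj = [[0.0] * num_tokens for _ in range(num_tokens)]
    for i in range(num_tokens):
        adj[i][i] = 1.0  # self-loop keeps each token's own features
    for head, dep in dep_edges:
        adj[head][dep] = 1.0
        adj[dep][head] = 1.0  # treat dependency arcs as undirected
    return adj


def gcn_layer(features, adj, weight):
    """Apply one layer: ReLU(D^-1 A H W), where D holds the row sums of A."""
    n = len(features)
    in_dim = len(features[0])
    out_dim = len(weight[0])
    output = []
    for i in range(n):
        degree = sum(adj[i])
        # Average the feature vectors of token i and its syntactic neighbors.
        agg = [sum(adj[i][j] * features[j][k] for j in range(n)) / degree
               for k in range(in_dim)]
        # Linear projection followed by ReLU.
        row = [max(0.0, sum(agg[k] * weight[k][o] for k in range(in_dim)))
               for o in range(out_dim)]
        output.append(row)
    return output


# Hypothetical sentence and parse, rooted at the trigger candidate "attacked".
tokens = ["troops", "attacked", "the", "base", "yesterday"]
dep_edges = [(1, 0), (1, 3), (3, 2), (1, 4)]  # (head, dependent) index pairs
# Toy 2-dimensional vectors standing in for contextual token representations.
features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [1.0, 1.0], [0.0, 0.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity projection for readability

adj = build_adjacency(len(tokens), dep_edges)
hidden = gcn_layer(features, adj, weight)
```

After one step, the vector for "attacked" mixes information from "troops", "base", and "yesterday" regardless of how far apart those words are in the sentence, which is the intuition behind using a dependency graph instead of purely sequential modeling. Stacking further layers would propagate information along longer syntactic paths.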

CLC Number:

  • TP183