Computer Science ›› 2021, Vol. 48 ›› Issue (5): 202-208. doi: 10.11896/jsjkx.200800038

Special topic: Natural Language Processing (virtual special topic)

• Artificial Intelligence •

Chinese Event Detection with Hierarchical and Multi-granularity Semantic Fusion

DING Ling (丁玲), XIANG Yang (向阳)

  1. School of Electronics and Information Engineering, Tongji University, Shanghai 201804, China
  • Received: 2020-08-10 Revised: 2020-09-15 Online: 2021-05-15 Published: 2021-05-09
  • Corresponding author: XIANG Yang (tjdxxiangyang@gmail.com)
  • About author: DING Ling, born in 1995, doctoral student, is a member of China Computer Federation. Her main research interests include natural language processing, information extraction and event extraction. (dling@tongji.edu.cn)
    XIANG Yang, born in 1962, Ph.D., professor, is a member of China Computer Federation. His main research interests include machine learning, data mining and natural language processing.
  • Supported by:
    National Basic Research Program of China (2019YFB1704402).

Abstract: Event detection is an important task in information extraction. It aims to identify event trigger words in unstructured natural language text and to classify them into event types. Existing neural network based methods usually treat event detection as a word-wise classification task, which causes a mismatch between triggers and segmented words when applied to Chinese. In addition, because Chinese words are polysemous, the same word can be ambiguous across different contexts. To address these two problems in Chinese event detection, we propose a Chinese event detection model with hierarchical and multi-granularity semantic fusion. First, we adopt a character-based sequence labelling method to solve the mismatch problem, and devise a Character-Word Fusion Gate to capture the semantic information of words under multiple segmentation results. Then we devise a Character-Sentence Fusion Gate that takes the semantic information of the entire sentence into consideration and learns a character-word-sentence hybrid representation of the sequence, which resolves the ambiguity problem. Finally, to balance the difference in number between the label “O” and the other labels, a loss function with bias is applied to train the model. Extensive experiments on the widely used ACE2005 dataset show that the proposed model outperforms existing Chinese event detection models by at least 3.9%, 1.4% and 2.9% in precision (P), recall (R) and F1 respectively, which demonstrates the effectiveness of the proposed method.
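
The character-based labelling idea and the label imbalance it produces can be illustrated with a small sketch. The sentence, its segmentation, the trigger spans, the event type names, the helper names and the weight `alpha` are all illustrative assumptions rather than details taken from the paper; the abstract only states that a "loss function with bias" is used, so the weighted negative log-likelihood below is one plausible form of it, not the authors' formula.

```python
import math

def char_bio_tags(sentence, triggers):
    """Assign one BIO tag per character.

    triggers: list of (start, end, event_type) character spans, end exclusive.
    """
    tags = ["O"] * len(sentence)
    for start, end, event_type in triggers:
        tags[start] = "B-" + event_type
        for i in range(start + 1, end):
            tags[i] = "I-" + event_type
    return tags

def word_spans(words):
    """Character spans (start, end) covered by each segmented word."""
    spans, pos = [], 0
    for word in words:
        spans.append((pos, pos + len(word)))
        pos += len(word)
    return spans

def mismatched_triggers(triggers, words):
    """Triggers whose span is not exactly one segmented word."""
    boundaries = set(word_spans(words))
    return [t for t in triggers if (t[0], t[1]) not in boundaries]

def biased_nll(gold_tags, gold_probs, alpha=5.0):
    """Negative log-likelihood in which non-"O" tags are weighted by alpha.

    gold_probs[i] is the probability a model assigns to the gold tag at
    position i. alpha > 1 is a hypothetical bias weight.
    """
    total = 0.0
    for tag, p in zip(gold_tags, gold_probs):
        weight = 1.0 if tag == "O" else alpha
        total -= weight * math.log(p)
    return total

# Hypothetical example: one segmented word ("射杀") contains two triggers.
sentence = "警察射杀了歹徒"
words = ["警察", "射杀", "了", "歹徒"]
triggers = [(2, 3, "Attack"), (3, 4, "Die")]

tags = char_bio_tags(sentence, triggers)
print(tags)
print(mismatched_triggers(triggers, words))
print(biased_nll(tags, [0.5] * len(tags)))
```

In this example a word-wise classifier could attach only one label to the segmented word that holds both triggers, while the character-level tags keep them apart. Most characters still receive "O", which is the imbalance the weighted loss is meant to offset.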

Key words: Bidirectional long short-term memory model, Chinese event detection, Convolutional neural network, Information extraction, Multi-granularity semantic fusion, Pre-trained language model
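
The two fusion gates named in the abstract can be pictured with a generic gating sketch. The abstract does not give the gate equations, so the form used here (a sigmoid gate that interpolates between two vectors, applied first to character and word vectors and then to the result and a sentence vector) is an assumption borrowed from common gating designs. The function names, dimensions and parameter values are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def fusion_gate(char_vec, other_vec, w_char, w_other, bias):
    """Gated interpolation of two equally sized vectors.

    g   = sigmoid(w_char @ char_vec + w_other @ other_vec + bias)
    out = g * char_vec + (1 - g) * other_vec   (element-wise)
    """
    pre = [a + b + c for a, b, c in zip(matvec(w_char, char_vec),
                                        matvec(w_other, other_vec),
                                        bias)]
    gate = [sigmoid(p) for p in pre]
    return [g * c + (1.0 - g) * o
            for g, c, o in zip(gate, char_vec, other_vec)]

# Hypothetical 2-dimensional vectors and untrained (all-zero) parameters.
zeros = [[0.0, 0.0], [0.0, 0.0]]
no_bias = [0.0, 0.0]

char_vec = [1.0, 0.0]      # character representation
word_vec = [0.0, 1.0]      # representation of a word containing the character
sent_vec = [1.0, 1.0]      # representation of the whole sentence

# Level 1: character-word fusion. Level 2: fuse the result with the sentence.
char_word = fusion_gate(char_vec, word_vec, zeros, zeros, no_bias)
hybrid = fusion_gate(char_word, sent_vec, zeros, zeros, no_bias)
print(char_word, hybrid)
```

With all-zero parameters every gate value is 0.5, so each level simply averages its two inputs. Trained parameters would let the gate decide, per dimension, how much word-level or sentence-level context to mix into each character.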

CLC Number:

  • TP182