Computer Science ›› 2019, Vol. 46 ›› Issue (8): 244-248. doi: 10.11896/j.issn.1002-137X.2019.08.040

• Artificial Intelligence •

Event Temporal Relation Classification Method Based on Self-attention Mechanism

ZHANG Yi-jie, LI Pei-feng, ZHU Qiao-ming

  (School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China)
    (Provincial Key Lab of Computer Information Processing Technology of Jiangsu, Suzhou, Jiangsu 215006, China)
  • Received: 2018-07-09  Published: 2019-08-15
  • Corresponding author: LI Pei-feng (born 1971), male, professor, Ph.D. supervisor, CCF member; his main research interests are natural language processing and machine learning. E-mail: pfli@suda.edu.cn
  • About the authors: ZHANG Yi-jie (born 1994), male, master's candidate, CCF student member; his main research interest is natural language processing. ZHU Qiao-ming (born 1963), male, professor, Ph.D. supervisor, CCF member; his main research interest is Chinese information processing.
  • Supported by:
    National Natural Science Foundation of China (61472265, 61772354, 61773276)


Abstract: Classifying temporal relations between events is an important follow-up task of event extraction. With the development of deep learning, neural networks play a vital role in event temporal relation classification. However, handling structural information and capturing long-distance dependencies remain a major challenge for conventional recurrent and convolutional neural networks. To address this issue, this paper proposes a neural architecture for event temporal relation classification based on the self-attention mechanism, which can directly capture the relationship between any two tokens in a sentence. Combining this mechanism with nonlinear layers significantly improves classification performance. Comparative experiments on the TimeBank-Dense and Richer Event Description datasets show that the proposed method outperforms most existing neural methods.
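To make the idea concrete, the following is a minimal NumPy sketch, not the authors' implementation: standard scaled dot-product self-attention, softmax(QK^T / sqrt(d_k)) V, lets every token attend to every other token in a single step, and a nonlinear layer then scores the temporal relation of one candidate event pair. All dimensions, the event positions, the random weights, and the six-way label set (as used in TimeBank-Dense) are illustrative assumptions.

# Minimal illustrative sketch (assumptions only, not the authors' model):
# single-head scaled dot-product self-attention over token embeddings,
# followed by a nonlinear layer scoring the relation of one event pair.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)        # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # X: (seq_len, d_model) token embeddings -> (seq_len, d_k) contextualized tokens
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])        # pairwise token-token scores
    return softmax(scores, axis=-1) @ V            # weighted sum over all tokens

rng = np.random.default_rng(0)
seq_len, d_model, d_k, n_labels = 12, 64, 32, 6    # assumed sizes; 6 labels as in TimeBank-Dense
X = rng.normal(size=(seq_len, d_model))            # stand-in for pretrained word embeddings
Wq, Wk, Wv = (rng.normal(scale=0.1, size=(d_model, d_k)) for _ in range(3))
H = self_attention(X, Wq, Wk, Wv)

e1, e2 = 3, 9                                      # assumed positions of the two event mentions
pair = np.concatenate([H[e1], H[e2]])              # representation of the event pair
W1, b1 = rng.normal(scale=0.1, size=(2 * d_k, 128)), np.zeros(128)
W2, b2 = rng.normal(scale=0.1, size=(128, n_labels)), np.zeros(n_labels)
hidden = np.tanh(pair @ W1 + b1)                   # nonlinear layer
probs = softmax(hidden @ W2 + b2)                  # distribution over temporal relation labels
print(probs.round(3))

In a trained model these matrices would be learned parameters rather than random draws; the sketch only shows how self-attention exposes every token-to-token interaction directly, without the recurrence or fixed receptive fields of RNNs and CNNs.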

Key words: Temporal relation, Deep learning, Self-attention mechanism

CLC Number: TP391.1