Computer Science ›› 2019, Vol. 46 ›› Issue (5): 214-220. doi: 10.11896/j.issn.1002-137X.2019.05.033

• Artificial Intelligence •

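BiLSTM-based Implicit Discourse Relation Classification Combining Self-attention Mechanism and Syntactic Information

FAN Zi-wei, ZHANG Min, LI Zheng-hua

  1. (School of Computer Sciences and Technology, Soochow University, Suzhou, Jiangsu 215006, China)
  • Published: 2019-05-15
  • About the authors: FAN Zi-wei (born 1992), male, master's student; main research interest: natural language processing; E-mail: 1204493155@qq.com. ZHANG Min (born 1970), male, Ph.D., professor; main research interests: natural language processing, machine translation, and artificial intelligence. LI Zheng-hua (born 1983), male, Ph.D., associate professor; main research interests: natural language processing, syntactic parsing, Chinese word segmentation, semantic analysis, and discourse analysis; E-mail: zhli13@suda.edu.cn (corresponding author).
  • Supported by:
    National Natural Science Foundation of China (61525205, 61876116).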

Abstract: Implicit discourse relation classification is a sub-task of shallow discourse parsing and an important task in natural language processing (NLP). An implicit discourse relation is a logical-semantic relation inferred from the argument pair of a discourse relation. The results of implicit discourse relation analysis can be applied to many NLP tasks, such as machine translation, automatic document summarization, and question answering. This paper proposes a method for implicit discourse relation classification based on a self-attention mechanism and syntactic information. A bidirectional long short-term memory network (BiLSTM) models the input argument pairs, which are combined with syntactic information, and represents them as low-dimensional dense vectors; a self-attention mechanism then filters the argument-pair information. Experiments on the PDTB 2.0 dataset show that the proposed method achieves better results than the baseline system.

Key words: Implicit discourse relation classification, Neural network, Self-attention mechanism, Syntactic information
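
The abstract does not give the exact form of the self-attention step, so the following is only a minimal sketch of one common variant: each encoder hidden state is scored against a learned vector, the scores are normalised with a softmax, and the weighted sum is used as the argument representation. The names `self_attention_pool`, `hidden_states`, and `query` are hypothetical, and the toy numbers stand in for BiLSTM outputs; none of this is taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_pool(hidden_states, query):
    """Weight each hidden state by softmax(h_i . query) and sum them.

    hidden_states: one vector per token (stand-in for BiLSTM outputs).
    query: hypothetical learned vector of the same dimension.
    """
    scores = [sum(h_j * q_j for h_j, q_j in zip(h, query))
              for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states))
              for d in range(dim)]
    return weights, pooled


# Toy input: three token states of dimension 2 (illustrative values only).
hidden_states = [[0.1, 0.3], [0.9, 0.2], [0.4, 0.8]]
query = [1.0, 0.0]
weights, pooled = self_attention_pool(hidden_states, query)
```

In this sketch the token whose state aligns best with `query` receives the largest weight, which is the sense in which attention "filters" the argument-pair information.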

CLC Number:

  • TP391