Computer Science ›› 2019, Vol. 46 ›› Issue (5): 214-220. doi: 10.11896/j.issn.1002-137X.2019.05.033


BiLSTM-based Implicit Discourse Relation Classification Combining Self-attention Mechanism and Syntactic Information

FAN Zi-wei, ZHANG Min, LI Zheng-hua   

  (School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China)
  Published: 2019-05-15

Abstract: Implicit discourse relation classification is a sub-task of shallow discourse parsing and an important task in natural language processing (NLP). An implicit discourse relation is a logical semantic relation inferred from the argument pair of a discourse relation. The results of implicit discourse relation analysis can be applied to many natural language processing tasks, such as machine translation, automatic document summarization, and question answering. This paper proposes a method based on a self-attention mechanism and syntactic information for the implicit discourse relation classification task. In this method, a Bidirectional Long Short-Term Memory network (BiLSTM) models the input argument pairs together with syntactic information and represents the argument pairs as low-dimensional dense vectors, and the self-attention mechanism screens the argument pair information. Finally, experiments are conducted on the PDTB 2.0 dataset. The experimental results show that the proposed model achieves better performance than the baseline systems.
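The architecture outlined in the abstract can be illustrated with a minimal PyTorch-style sketch. All names, layer sizes, and the choice of embedded syntactic tags (e.g. POS or dependency labels) as the syntactic signal are illustrative assumptions, not the authors' exact configuration: each argument is encoded by a BiLSTM over concatenated word and syntactic embeddings, a single-head additive self-attention layer weights the hidden states, and the pooled argument vectors are concatenated and classified into a relation sense.

    # Hypothetical sketch of a BiLSTM + self-attention classifier for implicit
    # discourse relations (padding masks omitted for brevity).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class BiLSTMSelfAttnClassifier(nn.Module):
        def __init__(self, vocab_size, emb_dim=300, syn_dim=50, hidden=128, num_classes=4):
            super().__init__()
            self.word_emb = nn.Embedding(vocab_size, emb_dim)
            # syntactic information (e.g. POS/dependency tags) as an extra embedding;
            # 100 is an assumed tag vocabulary size
            self.syn_emb = nn.Embedding(100, syn_dim)
            self.bilstm = nn.LSTM(emb_dim + syn_dim, hidden,
                                  batch_first=True, bidirectional=True)
            self.attn = nn.Linear(2 * hidden, 1)                  # additive self-attention score
            self.classifier = nn.Linear(4 * hidden, num_classes)  # concatenated arg1/arg2 vectors

        def encode(self, words, syn):
            x = torch.cat([self.word_emb(words), self.syn_emb(syn)], dim=-1)
            h, _ = self.bilstm(x)                    # (batch, seq, 2*hidden)
            scores = self.attn(h).squeeze(-1)        # (batch, seq)
            weights = F.softmax(scores, dim=-1)
            # attention-weighted sum of BiLSTM states -> one dense vector per argument
            return torch.bmm(weights.unsqueeze(1), h).squeeze(1)

        def forward(self, arg1_words, arg1_syn, arg2_words, arg2_syn):
            v1 = self.encode(arg1_words, arg1_syn)
            v2 = self.encode(arg2_words, arg2_syn)
            return self.classifier(torch.cat([v1, v2], dim=-1))   # logits over relation senses

In a PDTB 2.0 setting, num_classes=4 would correspond to the four top-level senses (Comparison, Contingency, Expansion, Temporal), although finer-grained sense inventories are also commonly used.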

Key words: Neural network, Implicit discourse relation classification, Self-attention mechanism, Syntactic information

CLC Number: TP391