Computer Science ›› 2019, Vol. 46 ›› Issue (5): 214-220.doi: 10.11896/j.issn.1002-137X.2019.05.033


BiLSTM-based Implicit Discourse Relation Classification Combining Self-attention Mechanism and Syntactic Information

FAN Zi-wei, ZHANG Min, LI Zheng-hua   

  (School of Computer Science and Technology,Soochow University,Suzhou,Jiangsu 215006,China)
  • Published:2019-05-15

Abstract: Implicit discourse relation classification is a sub-task of shallow discourse parsing and an important task in natural language processing (NLP). An implicit discourse relation is a logical semantic relation inferred from the argument pairs of a discourse relation. The results of implicit discourse relation analysis can be applied to many natural language processing tasks, such as machine translation, automatic document summarization, and question answering. This paper proposes a method based on the self-attention mechanism and syntactic information for the implicit discourse relation classification task. In this method, a Bidirectional Long Short-Term Memory network (BiLSTM) models the input argument pairs together with syntactic information and represents the argument pairs as low-dimensional dense vectors, and the self-attention mechanism then filters the argument-pair information. Finally, experiments were conducted on the PDTB 2.0 dataset. The experimental results show that the proposed model outperforms the baseline system.
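The pipeline the abstract describes (encode each argument, pool the hidden states with self-attention, then classify) can be sketched numerically. The following is a minimal NumPy illustration with untrained random weights, not the paper's implementation: the dimensions, the scoring vector `w`, and the 4-way output (standing in for the four top-level PDTB relation senses) are assumptions for the sketch, and the BiLSTM encoder itself is replaced by random hidden states.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_pool(H, w):
    # H: (n, d) hidden states (stand-in for BiLSTM outputs over n tokens)
    # w: (d,) scoring vector; produces one attention weight per position
    scores = H @ w               # (n,) unnormalized position scores
    alpha = softmax(scores)      # (n,) attention weights, sum to 1
    return alpha @ H             # (d,) weighted sum = argument representation

rng = np.random.default_rng(0)
H1 = rng.normal(size=(7, 8))     # "encoded" argument 1: 7 tokens, dim 8
H2 = rng.normal(size=(5, 8))     # "encoded" argument 2: 5 tokens, dim 8
w = rng.normal(size=8)           # shared attention scoring vector

# concatenate the two pooled argument vectors, then classify
v = np.concatenate([self_attention_pool(H1, w), self_attention_pool(H2, w)])
W_out = rng.normal(size=(4, 16)) # 4 relation classes x 16-dim pair vector
probs = softmax(W_out @ v)       # class distribution over relation senses
print(probs.shape)               # (4,)
```

In the actual model the hidden states would come from a trained BiLSTM over word and syntactic features, and `w` and `W_out` would be learned; the sketch only shows how self-attention turns variable-length argument encodings into fixed-size vectors for classification.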

Key words: Neural network, Implicit discourse relation classification, Self-attention mechanism, Syntactic information

CLC Number: TP391
[1]POPESCU-BELIS A,MEYER T.Using Sense-Labeled Discourse Connectives for Statistical Machine Translation[C]∥Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics.Pennsylvania,USA:Association for Computational Linguistics,2012:129-138.
[2]JANSEN P,SURDEANU M,CLARK P.Discourse Complements Lexical Semantics for Non-factoid Answer Reranking[C]∥Proceedings of the Association for Computational Linguistics.Pennsylvania,USA:Association for Computational Linguistics,2014:977-986.
[3]LOUIS A,JOSHI A,NENKOVA A.Discourse Indicators for Content Selection in Summarization[C]∥Proceedings of the Special Interest Group on Discourse and Dialogue.Pennsylvania,USA:Association for Computational Linguistics,2010:147-156.
[4]PITLER E,NENKOVA A.Using Syntax to Disambiguate Explicit Discourse Connectives in Text[C]∥Proceedings of the ACL-IJCNLP 2009 Conference Short Papers.Pennsylvania,USA:Association for Computational Linguistics,2009:13-16.
[5]PRASAD R,DINESH N,LEE A,et al.The Penn Discourse TreeBank 2.0[C]∥Proceedings of the International Conference on Language Resources and Evaluation.Paris,France:European Language Resources Association,2008:2961-2968.
[6]EDDY S.Hidden Markov models[J].Current Opinion in Structural Biology,1996,6(3):361-365.
[7]RATNAPARKHI A.A Maximum Entropy Model for Part-of-Speech Tagging[C]∥Proceedings of the Conference on Empirical Methods in Natural Language Processing.Pennsylvania,USA:Association for Computational Linguistics,1996:133-142.
[8]COLLINS M.Discriminative Training Methods for Hidden Markov Models:Theory and Experiments with Perceptron Algorithms[C]∥Proceedings of the Annual Meeting of the Association for Computational Linguistics.Pennsylvania,USA:Association for Computational Linguistics,2002:1-8.
[9]CHANG C C,LIN C J.LIBSVM:A library for support vector machines[J].ACM Transactions on Intelligent Systems and Technology,2011,2(3):1-27.
[10]LAFFERTY J,MCCALLUM A,PEREIRA F.Conditional Random Fields:Probabilistic Models for Segmenting and Labeling Sequence Data[C]∥Proceedings of the International Conference on Machine Learning.Massachusetts,USA:The International Machine Learning Society,2001:282-289.
[11]PITLER E,LOUIS A,NENKOVA A.Automatic Sense Prediction for Implicit Discourse Relations in Text[C]∥Proceedings of the Association for Computational Linguistics.Pennsylvania,USA:Association for Computational Linguistics,2009:683-691.
[12]LIN Z H,KAN M Y,NG H T.Recognizing Implicit Discourse Relations in the Penn Discourse Treebank[C]∥Proceedings of Empirical Methods in Natural Language Processing.Pennsylvania,USA:Association for Computational Linguistics,2009:343-351.
[13]WANG W T,SU J,TAN C L.Kernel Based Discourse Relation Recognition with Temporal Ordering Information[C]∥Proceedings of the Association for Computational Linguistics.Pennsylvania,USA:Association for Computational Linguistics,2010:710-719.
[14]RUTHERFORD A,XUE N W.Discovering Implicit Discourse Relations Through Brown Cluster Pair Representation and Coreference Patterns[C]∥Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics.Pennsylvania,USA:Association for Computational Linguistics,2014:645-654.
[15]QIN L H,ZHANG Z S,ZHAO H.Shallow Discourse Parsing Using Convolutional Neural Network[C]∥Proceedings of the Conference on Computational Natural Language Learning-Shared Task.Pennsylvania,USA:Association for Computational Linguistics,2016:70-77.
[16]SCHENK N,CHIARCOS C,DONANDT K,et al.Do We Really Need All Those Rich Linguistic Features?A Neural Network-Based Approach to Implicit Sense Labeling[C]∥Proceedings of the Conference on Computational Natural Language Learning-Shared Task.Pennsylvania,USA:Association for Computational Linguistics,2016:41-49.
[17]WEISS G,BAJEC M.Discourse Sense Classification from Scratch using Focused RNNs[C]∥Proceedings of the Conference on Computational Natural Language Learning-Shared Task.Pennsylvania,USA:Association for Computational Linguistics,2016:50-54.
[18]CHEN J F,ZHANG Q,LIU P F,et al.Implicit Discourse Relation Detection via a Deep Architecture with Gated Relevance Network[C]∥Proceedings of the Association for Computational Linguistics.Pennsylvania,USA:Association for Computational Linguistics,2016:1726-1735.
[19]DOZAT T,MANNING C D.Deep Biaffine Attention for Neural Dependency Parsing[C]∥Proceedings of 5th International Conference on Learning Representations.2017:24-26.
[20]ZHANG B,SU J,XIONG D,et al.Shallow Convolutional Neural Network for Implicit Discourse Relation Recognition[C]∥Proceedings of Empirical Methods in Natural Language Processing.Pennsylvania,USA:Association for Computational Linguistics,2015:2230-2235.
[21]RUTHERFORD A,XUE N.Improving the Inference of Implicit Discourse Relations via Classifying Explicit Discourse Connectives[C]∥Proceedings of Conference of the North American Chapter of the Association for Computational Linguistics:Human Language Technologies.Pennsylvania,USA:Association for Computational Linguistics,2015:799-808.
[22]LIU Y,LI S.Recognizing Implicit Discourse Relations via Repeated Reading:Neural Networks with Multi-Level Attention[C]∥Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing.Pennsylvania,USA:Association for Computational Linguistics,2016:1224-1233.
[23]LIU Y,LI S,ZHANG X,et al.Implicit discourse relation classification via multi-task neural networks[C]∥Thirtieth AAAI Conference on Artificial Intelligence.USA:AAAI Press,2016:2750-2756.
[24]LAN M,WANG J,WU Y,et al.Multi-task Attention-based Neural Networks for Implicit Discourse Relationship Representation and Identification[C]∥Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing.Pennsylvania,USA:Association for Computational Linguistics,2017:1299-1308.