Computer Science, 2020, Vol. 47, Issue (4): 157-163. doi: 10.11896/jsjkx.190300115

• Artificial Intelligence •


Survey of Implicit Discourse Relation Recognition Based on Deep Learning

HU Chao-wen (胡超文)1, YANG Ya-lian (杨亚连)2, WU Chang-xing (邬昌兴)1

  1. Virtual Reality and Interactive Techniques Institute, East China Jiaotong University, Nanchang 330013, China;
    2. School of Software, East China Jiaotong University, Nanchang 330013, China
  • Received: 2019-03-24  Online: 2020-04-15  Published: 2020-04-15
  • Contact: WU Chang-xing (wcxnlp@163.com), born in 1981, Ph.D., lecturer, is a member of CCF. His main research interests include natural language processing and machine learning.
  • About author: HU Chao-wen, born in 1993, master candidate. His main research interests include deep learning and natural language processing.
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61866012), the Natural Science Foundation of Jiangxi Province (20181BAB202012) and the Science and Technology Research Project of Jiangxi Education Department (GJJ180329).


Abstract: Implicit discourse relation recognition is still a challenging task in natural language processing. It aims to identify the semantic relation (such as contrast) between two arguments (clauses or sentences) that are not connected by an explicit discourse connective. In recent years, with the extensive application of deep learning in natural language processing, various deep learning based methods have achieved promising results on implicit discourse relation recognition, and their performance is much better than that of earlier methods based on manual features. This paper discusses recent implicit discourse relation recognition methods in three categories: argument encoding based methods, argument interaction based methods, and semi-supervised methods that incorporate explicit discourse data. Results on the PDTB data set show that: 1) by explicitly modeling the semantic relations between words or text spans in the two arguments, argument interaction based methods perform significantly better than argument encoding based methods; 2) by incorporating explicit discourse data, the semi-supervised methods can effectively alleviate the data sparsity problem and further improve recognition performance. Finally, this paper analyzes the main open problems and points out possible future research directions.

Key words: Deep learning, Implicit discourse relation recognition, Natural language processing
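
To make the difference between the first two categories concrete, the minimal PyTorch sketch below contrasts an argument encoding based model (each argument is encoded independently and the two vectors are concatenated for classification) with an argument interaction based model (a word-pair attention matrix aligns the two arguments before pooling). The class names, layer choices and hyperparameters are illustrative assumptions rather than the architecture of any surveyed paper; the four output classes only mirror the four top-level PDTB senses.

import torch
import torch.nn as nn


class ArgEncodingModel(nn.Module):
    """Encode each argument independently, then classify the concatenated vectors."""

    def __init__(self, vocab_size, emb_dim=100, hid_dim=128, num_relations=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.encoder = nn.LSTM(emb_dim, hid_dim, batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(4 * hid_dim, num_relations)

    def encode(self, arg):                        # arg: (batch, seq_len) word ids
        _, (h, _) = self.encoder(self.embed(arg))
        return torch.cat([h[-2], h[-1]], dim=-1)  # final forward/backward states

    def forward(self, arg1, arg2):
        return self.classifier(torch.cat([self.encode(arg1), self.encode(arg2)], dim=-1))


class ArgInteractionModel(ArgEncodingModel):
    """Additionally model word-pair interactions between the two arguments
    through a soft attention (alignment) matrix before pooling and classification."""

    def __init__(self, vocab_size, emb_dim=100, hid_dim=128, num_relations=4):
        super().__init__(vocab_size, emb_dim, hid_dim, num_relations)
        self.classifier = nn.Linear(8 * hid_dim, num_relations)

    def forward(self, arg1, arg2):
        h1, _ = self.encoder(self.embed(arg1))            # (batch, len1, 2*hid_dim)
        h2, _ = self.encoder(self.embed(arg2))            # (batch, len2, 2*hid_dim)
        scores = torch.bmm(h1, h2.transpose(1, 2))        # word-pair relevance scores
        a1 = torch.bmm(torch.softmax(scores, dim=2), h2)  # arg2 context for each arg1 word
        a2 = torch.bmm(torch.softmax(scores, dim=1).transpose(1, 2), h1)
        pooled = torch.cat([h1.mean(1), a1.mean(1), h2.mean(1), a2.mean(1)], dim=-1)
        return self.classifier(pooled)


if __name__ == "__main__":
    arg1 = torch.randint(1, 1000, (2, 12))                # two toy argument pairs
    arg2 = torch.randint(1, 1000, (2, 15))
    print(ArgEncodingModel(1000)(arg1, arg2).shape)       # torch.Size([2, 4])
    print(ArgInteractionModel(1000)(arg1, arg2).shape)    # torch.Size([2, 4])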

CLC Number: TP391.1
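
The semi-supervised direction mentioned in the abstract rests on a simple data manipulation: an explicit discourse instance, whose relation is signalled by a connective, is turned into a pseudo-implicit training instance by removing the connective and keeping the relation label. The sketch below illustrates only that conversion; the small connective-to-relation table and the field names are toy assumptions, not the PDTB sense inventory or the exact rules of any surveyed method.

from dataclasses import dataclass
from typing import Optional

# Hypothetical connective-to-relation table; real systems derive this mapping
# from PDTB annotations and usually discard ambiguous connectives.
CONNECTIVE_RELATION = {
    "but": "Comparison",
    "because": "Contingency",
    "in addition": "Expansion",
    "then": "Temporal",
}


@dataclass
class Instance:
    arg1: str
    arg2: str
    relation: str
    connective: Optional[str] = None  # None marks a (pseudo-)implicit instance


def explicit_to_pseudo_implicit(arg1: str, connective: str, arg2: str) -> Optional[Instance]:
    """Drop the connective and keep the relation it signals, if the connective is known."""
    relation = CONNECTIVE_RELATION.get(connective.lower())
    if relation is None:
        return None  # unknown or ambiguous connective: safer to discard the instance
    return Instance(arg1=arg1, arg2=arg2, relation=relation)


if __name__ == "__main__":
    extra = explicit_to_pseudo_implicit(
        "the company reported record revenue", "but", "its shares still fell")
    print(extra)  # relation='Comparison', connective=None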