Computer Science ›› 2021, Vol. 48 ›› Issue (11): 319-326. doi: 10.11896/jsjkx.201000099

• Artificial Intelligence •

Knowledge Distillation Based Implicit Discourse Relation Recognition

YU Liang, WEI Yong-feng, LUO Guo-liang, WU Chang-xing   

  1. School of Software, East China Jiaotong University, Nanchang 330013, China
  • Received: 2020-10-18  Revised: 2021-03-10  Online: 2021-11-15  Published: 2021-11-10
  • About author: YU Liang, born in 1996, postgraduate, is a member of China Computer Federation. His main research interests include natural language processing and deep learning.
    WU Chang-xing, born in 1981, Ph.D., lecturer, is a member of China Computer Federation. His main research interests include natural language processing and deep learning.
  • Supported by:
    National Natural Science Foundation of China (61866012), Natural Science Foundation of Jiangxi Province (20181BAB202012) and Science and Technology Research Project of Jiangxi Education Department (GJJ180329).

Abstract: Because connectives are absent, implicit discourse relation recognition models must infer the semantic relation (e.g., causal) between two arguments (clauses or sentences) from the semantics of the arguments alone. The performance of these models is still relatively low. Implicit discourse relations are also very difficult for corpus annotators to label, so annotators usually insert an appropriate connective to assist the annotation of each implicit discourse relation instance. Considering the above, a knowledge distillation based method is proposed for implicit discourse relation recognition that makes use of the connectives inserted during corpus annotation. Specifically, a connective-enhanced model is constructed to integrate the connective information, and the integrated connective information is then transferred to the implicit discourse relation recognition model via knowledge distillation. Experimental results on the commonly used PDTB dataset show that the proposed method achieves better performance than the baselines.
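
The transfer step described in the abstract can be illustrated with a generic soft-target distillation loss. The abstract does not state the exact objective, so the following is only a minimal sketch under assumptions: the connective-enhanced model plays the teacher, the implicit discourse relation recognition model plays the student, and the loss is the common weighted sum of hard-label cross-entropy and a temperature-scaled KL term. The function names, the `temperature` and `alpha` parameters, and the example logits are all hypothetical and are not taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, optionally softened by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, gold_index,
                      temperature=2.0, alpha=0.5):
    """Generic distillation objective (an assumption, not the paper's exact loss).

    (1 - alpha) * cross-entropy with the gold relation label
    + alpha * T^2 * KL(teacher || student) on temperature-softened outputs.
    """
    # Hard-label term: the student must still predict the annotated relation.
    student_probs = softmax(student_logits)
    cross_entropy = -math.log(student_probs[gold_index])

    # Soft-label term: the student mimics the teacher's softened distribution.
    teacher_soft = softmax(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s)
             for t, s in zip(teacher_soft, student_soft) if t > 0)

    return (1 - alpha) * cross_entropy + alpha * temperature ** 2 * kl


# Illustrative logits over the four top-level PDTB senses
# (Comparison, Contingency, Expansion, Temporal); the values are made up.
teacher = [0.1, 3.0, 0.5, -1.0]   # teacher sees the inserted connective
student = [0.4, 1.2, 0.9, -0.3]   # student sees only the two arguments
loss = distillation_loss(student, teacher, gold_index=1)
```

At test time only the student is used, which matches the setting in the abstract: the connective is available during annotation (training) but not when recognizing implicit relations in new text.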

Key words: Connective, Deep learning, Discourse structure analysis, Implicit discourse relation recognition, Knowledge distillation

CLC Number: 

  • TP391.1