Computer Science ›› 2020, Vol. 47 ›› Issue (3): 231-236. doi: 10.11896/jsjkx.190100108

• Artificial Intelligence •

  • Corresponding author: KONG Fang (kongfang@stu.suda.edu.cn)

Coreference Resolution Incorporating Structural Information

FU Jian,KONG Fang   

  1. (School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China)
  • Received:2019-01-15 Online:2020-03-15 Published:2020-03-30
  • About the authors: FU Jian, born in 1994, postgraduate student. His main research interests include coreference resolution and natural language processing. KONG Fang, born in 1977, Ph.D. Her main research interests include machine learning, natural language processing, and text analysis.
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61876118), Artificial Intelligence Emergency Project (61751206) and National Key Research and Development Plan Sub-project (2017YFB1002101).



Abstract: With the rise and development of deep learning, more and more researchers have begun to apply deep learning techniques to coreference resolution. However, existing neural coreference resolution models generally attend only to the sequential information of text and ignore structural information, which has proved very useful in traditional methods. Building on the neural coreference model of Lee et al., currently the best-performing approach, this paper proposes two measures that exploit the constituency parse tree to address this problem. First, node enumeration replaces the original span extraction strategy: candidate spans are taken from tree nodes, which removes the span-length restriction and reduces the number of candidates that violate syntactic rules. Second, node sequences are obtained by tree traversal and combined with features such as node height and path to generate a context representation of the constituency parse tree directly, which avoids the loss of structural information caused by using only word and character sequences. Extensive experiments on the CoNLL 2012 Shared Task dataset show that the proposed model achieves an average F1 of 62.35 for Chinese and 67.24 for English, demonstrating that the proposed strategy for integrating structural information significantly improves coreference resolution performance.
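The two ideas summarized in the abstract can be illustrated with a minimal sketch (not the authors' implementation): a pre-order traversal of a constituency tree yields both the set of node-aligned candidate spans and a node sequence annotated with a depth feature of the kind the paper combines with height and path information. Trees are represented here as nested tuples `(label, child, ...)` with plain strings as leaves; the format and the toy parse are illustrative assumptions.

```python
def num_leaves(tree):
    """Number of words spanned by a node (a leaf is one word)."""
    return 1 if isinstance(tree, str) else sum(num_leaves(c) for c in tree[1:])

def node_spans(tree, start=0, depth=0, out=None):
    """Pre-order traversal collecting (start, end, label, depth) for every
    node; [start, end) indexes into the sentence's word list. The emitted
    order is itself a linearization of the tree, and depth stands in for
    the height/path features fed to the model."""
    if out is None:
        out = []
    if isinstance(tree, str):            # leaf: a single word, no span emitted
        return out
    label, children = tree[0], tree[1:]
    out.append((start, start + num_leaves(tree), label, depth))
    offset = start
    for child in children:
        node_spans(child, offset, depth + 1, out)
        offset += num_leaves(child)
    return out

# "She saw the cat" with a toy parse
t = ("S", ("NP", ("PRP", "She")),
          ("VP", ("VBD", "saw"), ("NP", ("DT", "the"), ("NN", "cat"))))
for span in node_spans(t):
    print(span)
```

Note how only 8 node-aligned spans are produced for this 4-word sentence, whereas brute-force enumeration of all word spans up to the sentence length would yield 10, including syntactically implausible candidates such as "saw the".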

Key words: Constituency parse tree, Coreference resolution, Embedding, Height features, Structural information

CLC number: TP391
[1]HOBBS J R.Resolving pronoun references[J].Lingua,1978,44(4):311-338.
[2]LAPPIN S,LEASS H J.An algorithm for pronominal anaphora resolution[J].Computational Linguistics,1994,20(4):535-561.
[3]MCCORD M C.Slot grammar[M]∥Natural Language and Logic. Berlin:Springer,1990:118-145.
[4]KONG F,ZHOU G D.Pronoun Resolution in English and Chinese Languages Based on Tree Kernel[J].Journal of Software,2012,23(5):1085-1099.
[5]MIKOLOV T,SUTSKEVER I,CHEN K,et al.Distributed representations of words and phrases and their compositionality[C]∥Advances in Neural Information Processing Systems.Lake Tahoe:NIPS,2013:3111-3119.
[6]CLARK K,MANNING C D.Deep Reinforcement Learning for Mention-Ranking Coreference Models[C]∥Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing.Austin:EMNLP,2016:2256-2262.
[7]PRADHAN S,MOSCHITTI A,XUE N,et al.CoNLL-2012 shared task:Modeling multilingual unrestricted coreference in OntoNotes[C]∥Joint Conference on EMNLP and CoNLL-Shared Task.Jeju Island:ACL,2012:1-40.
[8]WU J L,MA W Y.A deep learning framework for coreference resolution based on convolutional neural network[C]∥2017 IEEE 11th International Conference on Semantic Computing (ICSC).San Diego:IEEE,2017:61-64.
[9]LEE K,HE L,LEWIS M,et al.End-to-end Neural Coreference Resolution[C]∥Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing.Copenhagen:ACL,2017:188-197.
[10]HOCHREITER S,SCHMIDHUBER J.Long short-term memory[J].Neural Computation,1997,9(8):1735-1780.
[11]LEE K,HE L,ZETTLEMOYER L.Higher-Order Coreference Resolution with Coarse-to-Fine Inference[C]∥Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics:Human Language Technologies.New Orleans:ACL,2018,2:687-692.
[12]PETERS M,NEUMANN M,IYYER M,et al.Deep Contextualized Word Representations[C]∥Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics:Human Language Technologies.New Orleans:ACL,2018:2227-2237.
[13]LIANG D,XU W,ZHAO Y.Combining word-level and character-level representations for relation classification of informal text[C]∥Proceedings of the 2nd Workshop on Representation Learning for NLP.Vancouver:ACL,2017:43-47.
[14]ZHANG X,ZHAO J,LECUN Y.Character-level convolutional networks for text classification[C]∥Advances in Neural Information Processing Systems.Montreal:NIPS,2015:649-657.
[15]LING W,DYER C,BLACK A W,et al.Finding Function in Form:Compositional Character Models for Open Vocabulary Word Representation[C]∥Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing.Lisbon:ACL,2015:1520-1530.
[16]VILAIN M,BURGER J,ABERDEEN J,et al.A model-theoretic coreference scoring scheme[C]∥Proceedings of the 6th Conference on Message Understanding.Columbia:ACL,1995:45-52.
[17]BAGGA A,BALDWIN B.Algorithms for scoring coreference chains[C]∥The First International Conference on Language Resources and Evaluation Workshop on Linguistics Coreference.Granada:LREC,1998,1:563-566.
[18]LUO X.On coreference resolution performance metrics[C]∥ Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing.Vancouver:ACL,2005:25-32.
[19]NAIR V,HINTON G E.Rectified linear units improve restricted Boltzmann machines[C]∥Proceedings of the 27th International Conference on International Conference on Machine Learning.Haifa:Omnipress,2010:807-814.
[20]SRIVASTAVA N,HINTON G,KRIZHEVSKY A,et al.Dropout:a simple way to prevent neural networks from overfitting[J].The Journal of Machine Learning Research,2014,15(1):1929-1958.
[21]KINGMA D P,BA J.Adam:A method for stochastic optimization[J].arXiv:1412.6980,2014.
[22]AL-RFOU R,PEROZZI B,SKIENA S.Polyglot:Distributed Word Representations for Multilingual NLP [C]∥Proceedings of the Seventeenth Conference on Computational Natural Language Learning.Sofia:ACL,2013:183-192.
[23]PENNINGTON J,SOCHER R,MANNING C.Glove:Global vectors for word representation[C]∥Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP).Doha:ACL,2014:1532-1543.
[24]CLARK K,MANNING C D.Improving Coreference Resolution by Learning Entity-Level Distributed Representations[C]∥Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics.Berlin:ACL,2016:643-653.
[25]DEVLIN J,CHANG M W,LEE K,et al.BERT:Pre-training of deep bidirectional transformers for language understanding[J].arXiv:1810.04805,2018.