Computer Science ›› 2019, Vol. 46 ›› Issue (10): 252-257. doi: 10.11896/jsjkx.180901780

• Artificial Intelligence •

Distant Supervision Relation Extraction Model Based on Multi-level Attention Mechanism

LI Hao, LIU Yong-jian, XIE Qing, TANG Ling-li

  1. (School of Computer Science and Technology, Wuhan University of Technology, Wuhan 430070, China)
    (State Press and Publication Administration Publishing Fusion Development (Wuhan) Key Laboratory, Wuhan 430070, China)
  • Received: 2018-09-20  Revised: 2018-12-19  Online: 2019-10-15  Published: 2019-10-21
  • Corresponding author: XIE Qing (born 1986), male, Ph.D., associate professor, CCF member; his main research interests include stream data mining, knowledge services and recommender systems. E-mail: felixxq@whut.edu.cn.
  • About the authors: LI Hao (born 1992), male, master's student; his main research interests include knowledge services and natural language processing. LIU Yong-jian (born 1962), male, Ph.D., professor; his main research interests include digitization of cultural resources, digital publishing and data dissemination. TANG Ling-li (born 1989), female, Ph.D., lecturer; her main research interests include digital dissemination and copyright protection.
  • Funding:
    This work was supported by the National Natural Science Foundation of China (61602353) and the Natural Science Foundation of Hubei Province (2017CFB505).

Abstract: As one of the main tasks of information extraction, entity relation extraction aims to determine the relation category between two entities in unstructured text. Supervised methods currently achieve high accuracy but are limited by their need for large amounts of manually annotated corpora, whereas distant supervision obtains a large number of relation triples by heuristically aligning a knowledge base with a text collection, and is therefore the main way to tackle large-scale relation extraction. To address the problems that current research on distant supervision relation extraction neither fully exploits the high-level semantics of context words in a sentence nor considers the dependency and inclusion relationships among relations, this paper proposes a distant supervision relation extraction model based on a multi-level attention mechanism. The model first encodes the word vectors of a sentence with a bidirectional GRU (Gated Recurrent Unit) network to obtain a high-level semantic representation of the sentence. Word-level attention is then introduced to compute the degree of correlation between the two entities and the context words, so that the semantic information of the entity context is fully captured. Next, sentence-level attention is built over multiple instances to reduce the impact of wrongly labeled instances. Finally, relation-level attention automatically learns the dependency and inclusion relationships among different relations. Experimental results on the FreeBase+NYT public dataset show that introducing word-level, sentence-level and relation-level attention on top of the bidirectional GRU model each improves distant supervision relation extraction, and that the multi-level attention model obtained by fusing the three attention layers improves accuracy and recall by about 4% compared with existing mainstream methods, achieving better relation extraction and laying a foundation for further applications such as knowledge graph construction and intelligent question answering.

Key words: Attention mechanism, Bidirectional GRU, Distant supervision, Relation extraction, Word embedding
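
The abstract describes the architecture only at a high level. As a minimal, hypothetical sketch of how the three attention levels could be stacked over a bidirectional GRU encoder (PyTorch-style; the class name, dimensions and the exact attention forms are assumptions made here for illustration, not the authors' released implementation):

import torch
import torch.nn as nn

class MultiLevelAttentionRE(nn.Module):
    def __init__(self, vocab_size, emb_dim=50, hidden_dim=100, num_relations=53):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Bidirectional GRU encoder over the word vectors of each sentence
        self.gru = nn.GRU(emb_dim, hidden_dim, bidirectional=True, batch_first=True)
        feat_dim = 2 * hidden_dim
        # Word-level attention: scores each context word against the two entities
        self.word_att = nn.Linear(3 * feat_dim, 1)
        # Sentence-level (selective) attention: one query vector per relation
        self.rel_query = nn.Parameter(torch.randn(num_relations, feat_dim))
        # Relation-level attention: re-weights the per-relation scores,
        # modelling dependency/inclusion between relation labels
        self.rel_att = nn.Linear(feat_dim, num_relations)
        self.classifier = nn.Linear(feat_dim, num_relations)

    def encode_sentence(self, tokens, e1_pos, e2_pos):
        # tokens: LongTensor of word ids; e1_pos/e2_pos: entity token positions
        h, _ = self.gru(self.embed(tokens).unsqueeze(0))      # (1, seq_len, feat)
        h = h.squeeze(0)
        e1, e2 = h[e1_pos], h[e2_pos]
        # Word-level attention over context words, conditioned on both entities
        pair = torch.cat([e1, e2]).expand(h.size(0), -1)
        alpha = torch.softmax(
            self.word_att(torch.cat([h, pair], dim=-1)).squeeze(-1), dim=0)
        return (alpha.unsqueeze(-1) * h).sum(dim=0)           # sentence vector

    def forward(self, bag, rel_id):
        # bag: list of (tokens, e1_pos, e2_pos) tuples sharing one entity pair
        sents = torch.stack([self.encode_sentence(*s) for s in bag])
        # Sentence-level attention: down-weights wrongly labelled instances
        beta = torch.softmax(sents @ self.rel_query[rel_id], dim=0)
        bag_vec = (beta.unsqueeze(-1) * sents).sum(dim=0)
        # Relation-level attention over the candidate relation scores
        gamma = torch.softmax(self.rel_att(bag_vec), dim=-1)
        return gamma * self.classifier(bag_vec)

Here a "bag" is the set of sentences that distant supervision aligns to one entity pair; at prediction time the relation with the highest attended score would be taken, and the sentence-level weights are what reduce the influence of wrongly labeled training sentences.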

CLC Number: TP391