Computer Science ›› 2023, Vol. 50 ›› Issue (8): 184-192. doi: 10.11896/jsjkx.220700082

• Artificial Intelligence •

Single-stage Joint Entity and Relation Extraction Method Based on Enhanced Sequence Annotation Strategy

ZHU Xiubao, ZHOU Gang, CHEN Jing, LU Jicang, XIANG Yixin

  1. State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450001, China
  • Received: 2022-07-08 Revised: 2022-12-01 Online: 2023-08-15 Published: 2023-08-02
  • Corresponding author: ZHOU Gang (gzhougzhou@126.com)
  • About author: (freeline55@163.com)
    ZHU Xiubao, born in 1995, master candidate. His main research interests include knowledge graph and data mining.
    ZHOU Gang, born in 1974, Ph.D., professor. His main research interests include big data analysis, knowledge graph and massive data processing.
  • Supported by:
    Science and Technology Project of Henan Province (222102210081).

Abstract: Extracting entities and relations from unstructured text is fundamental to the automatic construction of knowledge bases. Existing work mainly adopts joint learning to address nested entities, overlapping relations, redundant computation, and exposure bias, but each model performs well on only some of these problems, and no model solves all of them at once. Therefore, a single-stage joint entity and relation extraction method based on an enhanced sequence annotation strategy, called ATMREL (A Token With Multi-labels Entity and Relation Extraction), is proposed. First, an enhanced sequence annotation strategy is designed that tags each word in the text with multiple labels; each label encodes the position of the word within its entity, the relation type, and the entity location. Second, label prediction for each word is cast as a multi-label classification task, so that joint entity and relation extraction becomes a sequence annotation task. Finally, to strengthen the dependencies between entity pairs, an entity correlation matrix is introduced to prune the extraction results and improve extraction quality. Experimental results show that, compared with the CasRel and TPLinker models on the NYT and WebNLG datasets, ATMREL reduces the number of parameters by 3.1×10^6 to 5.4×10^6, improves average inference speed by 2 to 4.2 times, and improves the F1 value by 0.5% to 2.1%.

Key words: Joint entity and relation extraction, Sequence annotation, Combined labels, Correlation matrix
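
The last two steps of the abstract (per-word multi-label classification, then pruning with an entity correlation matrix) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the label format `position|relation|role`, the function names `predict_labels` and `prune_pairs`, the thresholds, and the toy scores are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
from typing import List, Set, Tuple


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits: List[List[float]],
                   label_names: List[str],
                   threshold: float = 0.5) -> List[Set[str]]:
    """Multi-label classification: each word keeps every label whose
    sigmoid probability exceeds the threshold, so one word can carry
    several labels at once (hypothetical label format)."""
    return [
        {label_names[j] for j, z in enumerate(row) if sigmoid(z) > threshold}
        for row in logits
    ]


def prune_pairs(candidates: List[Tuple[int, str, int]],
                correlation: List[List[float]],
                min_score: float = 0.5) -> List[Tuple[int, str, int]]:
    """Pruning: keep a (subject_index, relation, object_index) candidate
    only if the correlation score of that word pair is high enough."""
    return [
        (s, r, o) for (s, r, o) in candidates
        if correlation[s][o] >= min_score
    ]


# Toy input: two words, three hypothetical combined labels.
label_names = ["B|born_in|subj", "B|born_in|obj", "B|lives_in|subj"]
logits = [[2.0, -1.0, 3.0],
          [-2.0, 0.5, -3.0]]
labels = predict_labels(logits, label_names)

# Toy correlation matrix over word pairs (assumed values).
correlation = [[0.1, 0.9],
               [0.2, 0.1]]
kept = prune_pairs([(0, "born_in", 1), (1, "born_in", 0)], correlation)
```

In this toy run the first word receives two labels and the second receives one, and only the candidate whose word pair has a high correlation score survives pruning. Using an independent sigmoid per label, rather than a softmax over labels, is what allows a single word to take part in several relations.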

CLC Number:

  • TP391