Computer Science ›› 2024, Vol. 51 ›› Issue (8): 297-303. doi: 10.11896/jsjkx.230600231

• Artificial Intelligence •

Chinese Geological Entity Relation Extraction Based on RoBERTa and Weighted Graph Convolutional Networks

ZHANG Lu, DUAN Youxiang, LIU Juan, LU Yuxi

  1. College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, Shandong 266580, China
  • Received: 2023-06-29 Revised: 2023-11-12 Online: 2024-08-15 Published: 2024-08-13
  • Corresponding author: DUAN Youxiang (yxduan@upc.edu.cn)
  • Author contact: s21070043@s.upc.edu.cn
  • About author: ZHANG Lu, born in 1999, postgraduate, is a member of CCF (No.I1760G). Her main research interests include knowledge graphs and relation extraction.
    DUAN Youxiang, born in 1964, Ph.D., professor, is a member of CCF (No.05290S). His main research interests include network and service computing and the application of computer technology in oil and gas fields.
  • Supported by:
    Fundamental Research Funds for the Central Universities of Ministry of Education of China (20CX05017A) and Major Scientific and Technological Projects of CNPC (ZD2019-183-006).

Abstract: Knowledge is the cornerstone of big data and artificial intelligence. The interpretability and scalability of knowledge graphs make them a key technology for intelligent systems. Intelligent decision-making is in urgent demand across many fields, which gives knowledge graphs application scenarios in decision support based on data analysis and reasoning. However, domain scenarios are complex, data come from multiple sources, and knowledge spans many dimensions, so both the construction and the application of knowledge graphs face many challenges. To address the poor completeness of domain knowledge schemas in geological knowledge graph construction, and the shortcomings of existing entity relation extraction methods in handling non-Euclidean data, a graph-structure-based entity relation extraction model, RoGCN-ATT, is proposed. The model uses the Chinese pre-trained model RoBERTa-wwm-ext-large as the sequence encoder, combined with BiLSTM, to capture richer semantic information. It also uses a weighted graph convolutional network combined with an attention mechanism to capture structural dependency information and improve the extraction of relation triples. The model reaches an F1 score of 78.56% on a geological dataset. Comparative experiments with other models show that RoGCN-ATT effectively improves entity relation extraction performance and provides strong support for the construction and application of geological knowledge graphs.

Key words: Entity relation extraction, Graph convolutional networks, Dependency parsing, Attention mechanism, Geology domain

CLC Number: 

  • TP391
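
The weighted graph convolution named in the abstract can be illustrated with a minimal sketch. The abstract does not say how RoGCN-ATT derives its edge weights or how it normalises them, so everything here is an assumption: the function names `matmul` and `weighted_gcn_layer`, the row normalisation by weighted degree, and the toy dependency graph are all hypothetical. The sketch shows only one generic propagation step, ReLU(D^-1 (A + I) H W), and leaves out the RoBERTa encoder, the BiLSTM, and the attention mechanism.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product for small list-of-lists matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def weighted_gcn_layer(adj: Matrix, feats: Matrix, weight: Matrix) -> Matrix:
    """One generic weighted graph-convolution step (illustrative only).

    adj    : n x n matrix of non-negative edge weights (0 means no edge)
    feats  : n x d node features, e.g. token representations
    weight : d x d' projection matrix (learned in a real model)
    """
    n = len(adj)
    # Add self-loops so each node keeps part of its own representation.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    # Row-normalise by the weighted degree (an assumed normalisation).
    norm = [[v / sum(row) for v in row] for row in a_hat]
    # Aggregate neighbour features, project them, then apply ReLU.
    out = matmul(matmul(norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]


# Toy graph: three tokens, edge 0-1 with weight 1.0 and edge 1-2 with weight 0.5.
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 0.5],
       [0.0, 0.5, 0.0]]
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]

hidden = weighted_gcn_layer(adj, feats, identity)
```

In this toy graph, the lower weight on the 1-2 edge means token 2 contributes less to token 1's updated representation than token 0 does, which is the general effect a weighted adjacency matrix has over a binary one.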