Computer Science ›› 2020, Vol. 47 ›› Issue (12): 131-138. doi: 10.11896/jsjkx.191000161

• Database & Big Data & Data Science •

Extraction of Water Conservancy Spatial Relationship Words Based on Bootstrapping

XIANG Ying, FENG Jun, XIA Pei-pei, LU Jia-min

  1. College of Computer and Information, Hohai University, Nanjing 211100, China
  • Received: 2019-10-24 Revised: 2020-03-20 Online: 2020-12-15 Published: 2020-12-17
  • Corresponding author: FENG Jun (fengjun@hhu.edu.cn)
  • Author e-mail: 1290496228@qq.com
  • About author: XIANG Ying, born in 1995, postgraduate, is a member of China Computer Federation. Her main research interests include relation extraction.
    FENG Jun, born in 1969, Ph.D, professor, Ph.D supervisor, is a member of China Computer Federation. Her main research interests include spatiotemporal data management, intelligent data processing, data mining and water conservancy informatization.
  • Supported by:
    National Key R&D Program of China (2018YFC0407901), Young Scientists Fund of the National Natural Science Foundation of China (61602151) and Jiangsu Collaborative Innovation Center for Cultural Creativity (XYN1702).

Abstract: In building a knowledge graph from water conservancy domain databases, two problems arise in extracting spatial relationship words. First, the databases contain few spatial relationship words for water conservancy objects, which makes it difficult to meet query needs. Second, the relationship types between water conservancy objects are complex, and constructing them manually is too laborious. To solve these problems, this paper first extracts spatial relationship words from highly specialized, high-quality water conservancy official documents to form a seed set. It then expands the spatial relationship words through external dictionaries and combines them with the corpus to extract syntactic patterns oriented to water conservancy spatial relationship words. Finally, the generalized syntactic patterns are used to extract spatial relationship words from large-scale water conservancy text data and to generate spatial relationship tuples, which then serve as the seed set for repeating the above steps. With a small amount of manual work, this method obtains a large number of spatial semantic syntactic patterns and spatial relationship tuples from the corpus, and gradually expands and finally forms a dictionary of water conservancy spatial relationship words. The dictionary provides important support for expanding the knowledge graph of water conservancy objects and improving the accuracy of intelligent retrieval.

Key words: Knowledge graph, Relationship extraction, Spatial relationship, Water conservancy field

CLC Number: 

  • TP391.1
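
The iterative procedure outlined in the abstract (seed words, dictionary expansion, pattern learning, tuple extraction, repeat) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy corpus, the entity names, the lexicon, and all function names are hypothetical, sentences are assumed to be pre-segmented, and a "syntactic pattern" is reduced here to the pair of tokens surrounding a relationship word, with entities generalized to the placeholder ENT.

```python
# Minimal bootstrapping sketch under the assumptions stated above.
Sentence = tuple[str, ...]
Pattern = tuple[str, str]
RelationTuple = tuple[str, str, str]


def generalize(token: str, entities: set[str]) -> str:
    """Replace a concrete water conservancy object with the placeholder ENT."""
    return "ENT" if token in entities else token


def expand_with_lexicon(seeds: set[str], lexicon: dict[str, set[str]]) -> set[str]:
    """Add the dictionary synonyms of every known relationship word."""
    expanded = set(seeds)
    for word in seeds:
        expanded |= lexicon.get(word, set())
    return expanded


def context(sent: Sentence, i: int, entities: set[str]) -> Pattern:
    """Generalized (previous token, next token) context of position i."""
    return generalize(sent[i - 1], entities), generalize(sent[i + 1], entities)


def learn_patterns(words, corpus, entities) -> set[Pattern]:
    """Collect the contexts in which known relationship words occur."""
    return {
        context(sent, i, entities)
        for sent in corpus
        for i in range(1, len(sent) - 1)
        if sent[i] in words
    }


def extract_tuples(patterns, corpus, entities) -> set[RelationTuple]:
    """Apply patterns and emit (left entity, relationship word, right entity)."""
    found = set()
    for sent in corpus:
        for i in range(1, len(sent) - 1):
            if sent[i] in entities or context(sent, i, entities) not in patterns:
                continue
            left = [t for t in sent[:i] if t in entities]
            right = [t for t in sent[i + 1:] if t in entities]
            if left and right:
                found.add((left[-1], sent[i], right[0]))
    return found


def bootstrap(seeds, corpus, entities, lexicon, max_rounds=5):
    """Repeat expansion, pattern learning and extraction until nothing new appears."""
    words, tuples = set(seeds), set()
    for _ in range(max_rounds):
        expanded = expand_with_lexicon(words, lexicon)
        patterns = learn_patterns(expanded, corpus, entities)
        new_tuples = extract_tuples(patterns, corpus, entities)
        new_words = expanded | {rel for _left, rel, _right in new_tuples}
        if new_words == words and new_tuples <= tuples:
            break
        words, tuples = new_words, tuples | new_tuples
    return words, tuples


# Hypothetical toy data, invented for illustration only.
entities = {"Reservoir_A", "River_B", "Dam_C", "Canal_D"}
corpus = [
    ("Dam_C", "lies", "upstream_of", "River_B"),
    ("Reservoir_A", "lies", "downstream_of", "Dam_C"),
    ("Canal_D", "runs", "downstream_of", "Reservoir_A"),
    ("Canal_D", "runs", "alongside", "River_B"),
]
lexicon = {"upstream_of": {"above"}}

words, relation_tuples = bootstrap({"upstream_of"}, corpus, entities, lexicon)
print(sorted(words))
print(sorted(relation_tuples))
```

Starting from the single seed "upstream_of", the first round learns the context ("lies", ENT) and picks up "downstream_of"; the second round learns ("runs", ENT) from the new word and picks up "alongside". A real system would also need to score or filter patterns, since unrestricted bootstrapping tends to drift toward unrelated words.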