Computer Science ›› 2016, Vol. 43 ›› Issue (3): 213-219. doi: 10.11896/j.issn.1002-137X.2016.03.039

• Software and Database Technology •

Automatic Ontology Population Based on Heuristic Rules

LI Yi-xiao (李伊潇), LI Hong-wei (李宏伟), SHEN Li-wei (沈立炜) and ZHAO Wen-yun (赵文耘)

  1. School of Computer Science, Fudan University, Shanghai 201203; Shanghai Key Laboratory of Data Science, Fudan University, Shanghai 201203 (all authors)
  2. School of Computer Information Engineering, Jiangxi Normal University, Nanchang 330022 (LI Hong-wei)
  • Online: 2018-12-01 Published: 2018-12-01
  • Supported by:
    This work was supported by the National High Technology Research and Development Program of China ("863" Program, 2013AA01A605) and the National Natural Science Foundation of China (61402113).

Abstract: Automatically acquiring domain ontologies from Internet resources can shorten the ontology construction cycle, but automatic ontology population remains a challenge in ontology engineering. The main difficulties are how to extract terms and how to establish mappings between the new terms and the existing ontology. To address this, the paper proposes an automatic ontology population method based on heuristic rules. The method extracts natural language text from Internet resources, preprocesses it with natural language processing techniques, discovers domain terms by matching object properties first, populates the ontology by matching these terms against heuristic rules, and finally checks the consistency of the populated ontology. A Web-based ontology population tool was implemented on the basis of this method. In experiments that used an urban landscape information core ontology as a case study, the method achieved high precision and recall when populating individuals, which indicates that it is effective and feasible.

Keywords: Ontology population, Domain ontology, Term extraction, Heuristic rule
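The pipeline summarized in the abstract (match object properties first, type the matched terms with a heuristic rule, check consistency, then measure precision and recall) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the regular-expression patterns stand in for real natural language preprocessing, the class and property names (`Landmark`, `City`, `Architect`, `locatedIn`, `designedBy`) are hypothetical, the domain/range typing rule is only one plausible heuristic, and the single-class check is a crude stand-in for a proper consistency check.

```python
import re
from dataclasses import dataclass, field

# Assumption: a candidate term is a run of capitalized words.
NAME = r"[A-Z][a-z]+(?: [A-Z][a-z]+)*"

@dataclass
class Ontology:
    properties: dict                                  # object property -> (domain class, range class)
    individuals: dict = field(default_factory=dict)   # individual -> class
    assertions: set = field(default_factory=set)      # (subject, property, object)

# Hypothetical surface patterns, one per object property.
PATTERNS = {
    "locatedIn": re.compile(rf"({NAME}) is located in ({NAME})"),
    "designedBy": re.compile(rf"({NAME}) was designed by ({NAME})"),
}

def extract_triples(text):
    """Match object-property patterns first; the matched arguments become candidate terms."""
    triples = []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        for prop, pattern in PATTERNS.items():
            for subj, obj in pattern.findall(sentence):
                triples.append((subj, prop, obj))
    return triples

def populate(onto, triples):
    """Illustrative rule: type the subject by the property's domain and the object by its range.

    Returns the triples rejected by the simplified consistency check.
    """
    rejected = []
    for subj, prop, obj in triples:
        domain, rng = onto.properties[prop]
        # Simplified consistency check: an individual may belong to only one class.
        conflict = any(
            onto.individuals.get(term, cls) != cls
            for term, cls in ((subj, domain), (obj, rng))
        )
        if conflict:
            rejected.append((subj, prop, obj))
            continue
        onto.individuals[subj] = domain
        onto.individuals[obj] = rng
        onto.assertions.add((subj, prop, obj))
    return rejected

def precision_recall(predicted, gold):
    """Precision and recall of predicted (individual, class) pairs against a gold set."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall

# Usage with made-up text and a made-up seed ontology.
seed = Ontology(properties={
    "locatedIn": ("Landmark", "City"),
    "designedBy": ("Landmark", "Architect"),
})
text = ("Yu Garden is located in Shanghai. "
        "Jin Mao Tower was designed by Adrian Smith. "
        "Shanghai was designed by Adrian Smith.")
rejected = populate(seed, extract_triples(text))
print(seed.individuals)
print(rejected)
```

In this toy run the third sentence is rejected, because `Shanghai` has already been typed as a `City` and the `designedBy` rule would retype it as a `Landmark`. A real system would delegate that decision to an ontology reasoner and would use parsed sentences instead of surface patterns.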

