Ontology Concept Learning Method for Compound Terms

Abstract

Abstract: Term extraction plays an important role in text-based ontology concept learning. Because Chinese text has no explicit boundaries between words, domain terms, especially compound terms, are difficult to extract. Traditional term extraction methods usually require a large amount of computation and lack semantic support. This paper presents a novel ontology concept learning method for compound terms. First, natural language processing techniques are used to remove the irrelevant parts of the text and obtain candidate terms. Sentences are cut at punctuation marks and at the removed parts, so that candidate compound terms are protected from incorrect segmentation. The candidate domain-specific terms are then filtered with a multi-strategy approach based on term frequency and information entropy, according to the distributional and statistical characteristics of terms. Finally, the domain-specific concept set is obtained after synonymous terms are recognized. Experimental results show that the method can extract domain-specific word terms and compound terms with higher precision.
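The candidate extraction and filtering steps described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the stop-word list, the thresholds `min_freq` and `min_entropy`, and the reading of "information entropy" as the entropy of the characters adjacent to a term are all assumptions made for the example. Synonym recognition is not covered.

```python
import math
import re
from collections import Counter

PUNCTUATION = "，。；：、！？,.;:!?"
# Hypothetical stop-word list standing in for the "irrelevant parts" of the text.
STOP_WORDS = ["的", "是", "在", "和"]


def split_candidates(text, stop_words=STOP_WORDS):
    """Cut text at punctuation marks and stop words; the remaining spans are candidates."""
    parts = [f"[{re.escape(PUNCTUATION)}]"] + [re.escape(w) for w in stop_words]
    pattern = "|".join(parts)
    return [span.strip() for span in re.split(pattern, text) if span.strip()]


def shannon_entropy(counter):
    """Shannon entropy (in bits) of the distribution given by a Counter."""
    total = sum(counter.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counter.values())


def neighbor_entropy(text, term, side):
    """Entropy of the characters adjacent to each occurrence of term.

    Assumption: a term with varied neighbors is more likely a complete unit.
    """
    neighbors = Counter()
    start = text.find(term)
    while start != -1:
        idx = start - 1 if side == "left" else start + len(term)
        if 0 <= idx < len(text):
            neighbors[text[idx]] += 1
        start = text.find(term, start + 1)
    return shannon_entropy(neighbors)


def filter_terms(text, min_freq=2, min_entropy=0.5):
    """Keep candidates that are frequent enough and have varied left/right context."""
    freq = Counter(split_candidates(text))
    kept = []
    for term, count in freq.items():
        if count < min_freq:
            continue
        left = neighbor_entropy(text, term, "left")
        right = neighbor_entropy(text, term, "right")
        if min(left, right) >= min_entropy:
            kept.append(term)
    return kept


sample = "本体学习的方法。本体学习是重点，本体学习在文本。"
terms = filter_terms(sample)
```

Because the text is cut only at punctuation and stop words, a compound term such as 本体学习 survives as a single candidate instead of being broken into smaller words by a segmenter. The thresholds would need tuning on a real domain corpus.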

Key words: Term extraction, Term filtering, Compound terms, Ontology concept learning

LI Jiang-hua, SHI Peng and HU Chang-jun. Ontology Concept Learning Method for Compound Terms[J]. Computer Science, 2013, 40(5): 168-172.
