Computer Science, 2020, Vol. 47, Issue (5): 43-50. doi: 10.11896/jsjkx.200100129

Topic: Theoretical Computer Science

Stability Analysis of Ontology Learning Algorithm in Decision Graph Setting

ZHU Lin-li1,2, HUA Gang2, GAO Wei3

  1 School of Computer Engineering, Jiangsu University of Technology, Changzhou, Jiangsu 213001, China
  2 School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China
  3 School of Information, Yunnan Normal University, Kunming 650500, China
  • Received: 2020-01-19 Online: 2020-05-15 Published: 2020-05-19
  • Corresponding author: ZHU Lin-li (zhulinli@jsut.edu.cn)
  • About author: ZHU Lin-li, born in 1975, Ph.D., senior engineer. His main research interests include artificial intelligence and machine learning.
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (51574232).

Abstract: Traditional ontology algorithms compute semantic similarity with heuristic methods. As the volume of data processed by ontologies grows, more and more machine learning methods are used to obtain ontology functions. Stability is a necessary condition for an ontology learning algorithm: a slight change to the ontology sample set must not substantially change the optimal ontology function obtained. This paper studies the stability and the corresponding statistical characteristics of ontology learning algorithms in the setting where the dependency relations among ontology samples are determined by a graph structure. First, the traditional PO and LTO uniform stability conditions are analyzed. Second, the uniform stability conditions are extended to the large-sample case: Pk and LkO uniform stability are proposed and related theoretical results are obtained. Finally, the two sample transformations, replacing ontology samples and deleting ontology samples, are combined to introduce the concept of combined uniform stability for large ontology samples, and general results are obtained using methods from statistical learning theory. In addition, under the various stability conditions, the generalized bounds of ontology learning algorithms that satisfy the m-independent condition are discussed.

Key words: Generalized bound, Machine learning, Ontology, Sample capacity, Stability
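
As an informal illustration of the two sample transformations named in the abstract (replacing a sample and deleting a sample), the sketch below measures how much a learned function changes under each. It is not the paper's method: the learner is a one-dimensional ridge regression chosen only for simplicity, the data are synthetic and noise-free, the samples are treated as independent (no dependency graph is modeled), and every name in it (`fit_ridge`, `replace_one_gap`, `delete_one_gap`, `make_samples`) is hypothetical.

```python
def fit_ridge(samples, lam=1.0):
    """Fit y ~ w * x by ridge regression through the origin (closed form)."""
    n = len(samples)
    sxy = sum(x * y for x, y in samples)
    sxx = sum(x * x for x, _ in samples)
    return sxy / (sxx + lam * n)


def worst_gap(w_ref, variants, grid, lam):
    """Largest change in prediction over all perturbed sample sets and grid points."""
    return max(
        abs(w_ref - fit_ridge(variant, lam)) * abs(x)
        for variant in variants
        for x in grid
    )


def replace_one_gap(samples, new_point, grid, lam=1.0):
    """Worst prediction change when any single sample is replaced by new_point."""
    variants = [samples[:i] + [new_point] + samples[i + 1:] for i in range(len(samples))]
    return worst_gap(fit_ridge(samples, lam), variants, grid, lam)


def delete_one_gap(samples, grid, lam=1.0):
    """Worst prediction change when any single sample is deleted."""
    variants = [samples[:i] + samples[i + 1:] for i in range(len(samples))]
    return worst_gap(fit_ridge(samples, lam), variants, grid, lam)


def make_samples(n):
    """Synthetic data (an assumption for illustration): x evenly spaced in [-1, 1], y = 2x."""
    xs = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    return [(x, 2.0 * x) for x in xs]


grid = [-1.0, -0.5, 0.0, 0.5, 1.0]  # points at which predictions are compared
for n in (20, 200):
    s = make_samples(n)
    # (1.0, -2.0) is an arbitrary replacement point that contradicts the trend y = 2x
    print(n, replace_one_gap(s, (1.0, -2.0), grid), delete_one_gap(s, grid))
```

In this toy setting both gaps shrink as the sample size grows, which is the qualitative behavior a uniform stability condition asks for. The conditions studied in the paper additionally account for graph-structured dependence among samples, which this sketch leaves out.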

CLC Number: TP391