Computer Science ›› 2020, Vol. 47 ›› Issue (5): 43-50. doi: 10.11896/jsjkx.200100129

• Theoretical Computer Science •

Stability Analysis of Ontology Learning Algorithm in Decision Graph Setting

ZHU Lin-li1,2, HUA Gang2, GAO Wei3

  1. 1 School of Computer Engineering, Jiangsu University of Technology, Changzhou, Jiangsu 213001, China
    2 School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China
    3 School of Information, Yunnan Normal University, Kunming 650500, China
  • Received: 2020-01-19 Online: 2020-05-15 Published: 2020-05-19
  • Corresponding author: ZHU Lin-li (zhulinli@jsut.edu.cn)
  • About author: ZHU Lin-li, born in 1975, Ph.D, senior engineer. His main research interests include artificial intelligence and machine learning.
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (51574232).

Abstract: Traditional ontology algorithms compute semantic similarity with heuristic methods. As the volume of data processed by ontologies grows, machine learning methods are increasingly used to obtain ontology functions. Stability is a necessary condition for an ontology learning algorithm: a slight change to the ontology sample set must not substantially change the optimal ontology function obtained. This paper studies the stability and the corresponding statistical properties of ontology learning algorithms in the setting where the dependency relations among ontology samples are determined by a graph structure. First, the traditional PO and LTO uniform stability conditions are analyzed. Second, the uniform stability conditions are extended to the large-sample case: Pk and LkO uniform stability are proposed and related theoretical results are obtained. Finally, the two sample transformations (replacing ontology samples and deleting ontology samples) are combined to propose the concept of combined uniform stability for large ontology samples, and general results are obtained using methods from statistical learning theory. In addition, the generalized bounds of ontology learning algorithms that satisfy the m-independent condition are discussed under each stability condition.

Key words: Ontology, Machine learning, Stability, Sample capacity, Generalized bound
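
The stability requirement stated in the abstract (replacing or deleting a sample should barely change the learned function) can be probed empirically. The following is a minimal sketch, not the paper's algorithm: it assumes a one-dimensional ridge regression as a stand-in learner, synthetic i.i.d. samples rather than graph-dependent ontology samples, and the hypothetical helper names `fit_ridge_1d`, `draw`, and `stability_gaps`. It reports the largest observed loss change under single-sample replacement and deletion, which is only an empirical estimate of a uniform stability constant, not a bound.

```python
import random


def fit_ridge_1d(sample, lam):
    """Minimize (1/n) * sum((w*x - y)^2) + lam * w^2 over the scalar w."""
    n = len(sample)
    sxy = sum(x * y for x, y in sample)
    sxx = sum(x * x for x, _ in sample)
    return sxy / (sxx + lam * n)


def loss(w, z):
    """Squared loss of the linear function w*x on the point z = (x, y)."""
    x, y = z
    return (w * x - y) ** 2


def draw(rng):
    """Draw one synthetic sample (assumed model: y = 2x + small Gaussian noise)."""
    x = rng.uniform(-1.0, 1.0)
    return (x, 2.0 * x + rng.gauss(0.0, 0.1))


def stability_gaps(n, lam=0.1, seed=0, n_probe=200):
    """Return (replace_gap, delete_gap): the largest observed change in loss
    on probe points when one training sample is replaced or deleted."""
    rng = random.Random(seed)
    sample = [draw(rng) for _ in range(n)]
    probes = [draw(rng) for _ in range(n_probe)]
    w_full = fit_ridge_1d(sample, lam)
    replace_gap = 0.0
    delete_gap = 0.0
    for i in range(n):
        # Replace the i-th sample with a fresh draw, or delete it outright.
        w_rep = fit_ridge_1d(sample[:i] + [draw(rng)] + sample[i + 1:], lam)
        w_del = fit_ridge_1d(sample[:i] + sample[i + 1:], lam)
        for z in probes:
            base = loss(w_full, z)
            replace_gap = max(replace_gap, abs(base - loss(w_rep, z)))
            delete_gap = max(delete_gap, abs(base - loss(w_del, z)))
    return replace_gap, delete_gap


if __name__ == "__main__":
    for n in (20, 400):
        print(n, stability_gaps(n))
```

For a regularized learner like this stand-in, both gaps are expected to shrink as the sample size grows, which is the qualitative behavior a stability condition formalizes.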

CLC Number: 

  • TP391