Computer Science ›› 2017, Vol. 44 ›› Issue (12): 48-51. doi: 10.11896/j.issn.1002-137X.2017.12.009

• The 4th CCF Conference on Big Data •

Improved MIMLSVM Algorithm Based on Concept Weight Vector

HUAN Tian, HAO Ning and NIU Qiang

  1. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116
  • Online: 2018-12-01 Published: 2018-12-01
  • Supported by:
    This work was supported by the Prospective Joint Research Project of the Jiangsu Province Industry-University-Research Joint Innovation Fund (BY2014028-09).

Abstract: The multi-instance multi-label learning algorithm MIMLSVM constructs clusters only at the bag level and ignores the influence that the distribution of instances within a bag has on classification. To address this shortcoming, this paper proposes I-MIMLSVM, an improved MIMLSVM algorithm based on concept weight vectors. First, clusters are constructed at the instance level to mine the latent concept clusters among the instances, and the R-PATTERN algorithm is used to compute a concept weight for each concept cluster. Then, the TF-IDF algorithm is used to compute the importance of each concept cluster in each bag. Finally, each bag is represented as a concept weight vector, each dimension of which is the product of a concept cluster's concept weight and its importance in that bag. Experiments on a natural data set containing 2000 images show that the improved algorithm outperforms the original algorithm overall in classification performance, most clearly on three evaluation metrics: Hamming loss, Coverage and Average precision.

Key words: MIMLSVM, Clustering, R-PATTERN, TF-IDF
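
The bag representation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the instance-level clustering has already been run, so each instance carries a concept-cluster index, and that the concept weights are supplied from outside (R-PATTERN is not reproduced here). The function name `concept_weight_vectors`, the toy data, and the plain `log(N / df)` form of IDF are all assumptions made for illustration.

```python
import math
from collections import Counter


def concept_weight_vectors(bag_assignments, concept_weights):
    """Represent each bag as a vector of (concept weight x TF-IDF importance).

    bag_assignments: one list per bag, holding the concept-cluster index of
        each instance in that bag (output of an instance-level clustering).
    concept_weights: one weight per concept cluster; assumed to come from
        R-PATTERN, which is not reproduced in this sketch.
    """
    n_bags = len(bag_assignments)
    n_concepts = len(concept_weights)

    # Document frequency: number of bags with at least one instance of concept k.
    doc_freq = [0] * n_concepts
    for assignment in bag_assignments:
        for k in set(assignment):
            doc_freq[k] += 1

    vectors = []
    for assignment in bag_assignments:
        counts = Counter(assignment)
        vector = []
        for k in range(n_concepts):
            # Term frequency: share of the bag's instances that fall in concept k.
            tf = counts[k] / len(assignment) if assignment else 0.0
            # Inverse document frequency, plain log(N / df) variant (an assumption).
            idf = math.log(n_bags / doc_freq[k]) if doc_freq[k] else 0.0
            vector.append(concept_weights[k] * tf * idf)
        vectors.append(vector)
    return vectors


# Hypothetical toy input: 3 bags, 3 concept clusters.
bags = [[0, 0, 1], [1, 2], [1, 1, 1, 2]]
weights = [0.5, 0.3, 0.2]  # placeholder values, not R-PATTERN output
vectors = concept_weight_vectors(bags, weights)
```

With this IDF variant, a concept that occurs in every bag (cluster 1 in the toy data) gets importance zero in all bags. A smoothed IDF would avoid that; the abstract does not say which variant the paper uses. The resulting vectors would then be passed to the multi-label SVM stage of MIMLSVM in place of the bag-level representation.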

