Computer Science (计算机科学), 2018, Vol. 45, Issue (1): 173-178. doi: 10.11896/j.issn.1002-137X.2018.01.030

• The 16th China Conference on Machine Learning •

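Multi-label-specific Feature Selection Method Based on Neighborhood Rough Set

SUN Lin, PAN Jun-fang, ZHANG Xiao-yu, WANG Wei and XU Jiu-cheng

  1. Engineering Technology Research Center for Computing Intelligence and Data Mining of Henan Province Universities, Xinxiang, Henan 453007; Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054; College of Computer and Information Engineering, Henan Normal University, Xinxiang, Henan 453007; College of Computer and Information Engineering, Henan Normal University, Xinxiang, Henan 453007; College of Computer and Information Engineering, Henan Normal University, Xinxiang, Henan 453007; Postdoctoral Research Station of Biology, College of Life Sciences, Henan Normal University, Xinxiang, Henan 453007
  • Online: 2018-01-15 Published: 2018-11-13
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61772176, 61402153, 9, 61602158), the China Postdoctoral Science Foundation (2016M602247), the Science and Technology Research Project of Henan Province (162102210261), the Science and Technology Research Program of Xinxiang City (CXGG17002), and the Doctoral Research Start-up Fund of Henan Normal University (qd15132).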

Abstract: Dimensionality reduction is an important and challenging task in multi-label learning, and feature selection is an efficient dimensionality reduction technique. This paper proposes a multi-label-specific feature selection method based on neighborhood rough set theory. The method guarantees theoretically that the selected label-specific features are strongly correlated with their corresponding labels, which improves the quality of the reduction. First, a reduction algorithm from rough set theory is applied to remove redundant attributes, so that the label-specific features of each label are obtained while the classification ability is kept unchanged. Second, building on the concepts of neighborhood accuracy and neighborhood roughness, the computation of dependency and attribute significance under the neighborhood rough set model is redefined, and the related properties of the model are discussed. Finally, a multi-label-specific feature selection model based on neighborhood rough sets is constructed, and the corresponding feature selection algorithm for multi-label classification tasks is implemented. Simulation experiments on several public datasets show that the proposed algorithm is effective.

Key words: Multi-label learning, Neighborhood rough set, Label-specific feature, Feature selection
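The abstract states that dependency and attribute significance are redefined from neighborhood accuracy and neighborhood roughness, but this page does not give those formulas. The sketch below therefore illustrates only the general idea of label-specific selection, using the classical neighborhood rough set dependency (the fraction of samples whose δ-neighborhood is label-pure) and a greedy forward search run once per label. The Euclidean distance, the radius `delta`, the stopping tolerance `eps`, and all function names are assumptions made for illustration, not the authors' definitions.

```python
from math import sqrt

def neighborhood(X, i, feats, delta):
    """Indices of samples within Euclidean distance delta of sample i,
    measured only on the feature subset feats (assumed metric)."""
    return [j for j in range(len(X))
            if sqrt(sum((X[i][f] - X[j][f]) ** 2 for f in feats)) <= delta]

def dependency(X, y, feats, delta):
    """Classical neighborhood dependency: share of samples whose
    delta-neighborhood contains only samples with the same label value."""
    if not feats:
        return 0.0
    pos = sum(1 for i in range(len(X))
              if all(y[j] == y[i] for j in neighborhood(X, i, feats, delta)))
    return pos / len(X)

def label_specific_features(X, y, delta=0.2, eps=1e-9):
    """Greedy forward selection for a single label. The significance of a
    candidate a given the selected set B is dependency(B + [a]) - dependency(B)."""
    selected, remaining = [], list(range(len(X[0])))
    current = 0.0
    while remaining:
        gains = [(dependency(X, y, selected + [a], delta) - current, a)
                 for a in remaining]
        # Highest gain wins; ties go to the lowest feature index.
        best_gain, best = max(gains, key=lambda g: (g[0], -g[1]))
        if best_gain <= eps:  # no candidate improves the dependency
            break
        selected.append(best)
        remaining.remove(best)
        current += best_gain
    return selected

def multi_label_select(X, Y, delta=0.2):
    """Run the per-label selection for every label column of Y."""
    return {k: label_specific_features(X, [row[k] for row in Y], delta)
            for k in range(len(Y[0]))}

# Toy data (hypothetical): feature 0 separates label 0, feature 2 separates
# label 1, and feature 1 is uninformative for both.
X = [[0.1, 0.5, 0.9],
     [0.2, 0.4, 0.1],
     [0.9, 0.5, 0.8],
     [0.8, 0.4, 0.2]]
Y = [[0, 1], [0, 0], [1, 1], [1, 0]]

print(multi_label_select(X, Y))  # {0: [0], 1: [2]}
```

On this toy data each label ends up with its own single feature, which is the sense in which the selected features are label-specific. The pairwise distance computation makes the sketch quadratic in the number of samples per dependency evaluation, so it is meant for reading rather than for large datasets.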

