Computer Science ›› 2019, Vol. 46 ›› Issue (12): 250-256. doi: 10.11896/jsjkx.181102031

• Artificial Intelligence •

Multi-scale Based Accelerator for Attribute Reduction

JIANG Ze-hua1, WANG Yi-bo2, XU Gang3, YANG Xi-bei1, WANG Ping-xin4

  1. School of Computer Science, Jiangsu University of Science and Technology, Zhenjiang, Jiangsu 212003, China
  2. School of Computer Science and Engineering, Southeast University, Nanjing 211189, China
  3. School of Naval Architecture and Ocean Engineering, Jiangsu University of Science and Technology, Zhenjiang, Jiangsu 212003, China
  4. School of Science, Jiangsu University of Science and Technology, Zhenjiang, Jiangsu 212003, China
  • Received: 2018-11-04 Online: 2019-12-15 Published: 2019-12-17
  • Corresponding author: YANG Xi-bei (born 1980), male, postdoctoral researcher, associate professor. His main research interests are granular computing, rough sets and machine learning. E-mail: jsjxy_yxb@just.edu.cn.
  • About the authors: JIANG Ze-hua (born 1997), female, master's student; main research interests: rough sets and granular computing. WANG Yi-bo (born 1999), male; main research interests: rough sets and granular computing. XU Gang (born 1981), male, Ph.D., associate professor; main research interest: intelligent information processing. WANG Ping-xin (born 1980), male, postdoctoral researcher, associate professor; main research interests: three-way decisions and granular computing.
  • Funding:
    This work was supported by the National Natural Science Foundation of China (61572242, 61502211, 61503160).

Abstract: A neighborhood rough set uses a radius to decide whether two samples are similar, so different radii naturally induce rough approximations at different scales. Attribute reduction based on neighborhood rough sets is therefore often carried out over several radii, either to find an attribute subset with better generalization performance or to study how the performance of the reducts changes across scales. When a traditional heuristic algorithm is used to compute reducts at multiple scales, however, it has to be rerun from scratch at every scale, so the time consumption is high and grows with the number of scales. To address this problem, this paper proposes a multi-scale acceleration strategy for computing reducts that exploits the change of radius. The strategy considers two directions of change, from smaller to larger radii and from larger to smaller radii, and reduces the number of samples and attributes that must be traversed: the search for a reduct at the current radius starts from the reduct obtained at the previous radius, and heuristic search then adds or deletes attributes in the forward or backward direction. To validate the strategy, experiments were conducted on 8 UCI data sets, using 10-fold cross-validation to compute reducts at 20 radii and comparing the time consumption and classification performance of the different methods. The results show that, compared with running the traditional heuristic algorithm separately at each scale, the proposed forward and backward accelerated searches greatly reduce the time needed to compute multi-scale reducts without significantly changing classification performance, and they also reduce the degree of over-fitting.

Key words: Attribute reduction, Heuristic searching, Multi-scale, Neighborhood rough set
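
The abstract gives only an outline of the accelerated search, so the following is a minimal Python sketch under stated assumptions rather than the authors' algorithm. It assumes Euclidean distance, uses approximation quality (the fraction of samples whose whole neighborhood shares their label) as the stopping criterion, and covers only the forward direction, from smaller to larger radii. The names `gamma`, `forward_reduct` and `multi_scale_reducts` are hypothetical. The reduction of the sample traversal mentioned in the abstract is not modeled, and a warm-started result may keep attributes that a search from scratch would not select.

```python
import math


def gamma(X, y, attrs, radius):
    """Approximation quality: share of samples whose radius-neighborhood
    (Euclidean distance over `attrs`, an assumption) is label-consistent."""
    attrs = sorted(attrs)  # fixed order keeps distances reproducible
    consistent = 0
    for i, xi in enumerate(X):
        ok = True
        for j, xj in enumerate(X):
            d = math.sqrt(sum((xi[a] - xj[a]) ** 2 for a in attrs))
            if d <= radius and y[i] != y[j]:
                ok = False
                break
        consistent += ok
    return consistent / len(X)


def forward_reduct(X, y, radius, start=()):
    """Greedy forward search: add the attribute that raises gamma the most
    until the subset matches the quality of the full attribute set."""
    all_attrs = list(range(len(X[0])))
    target = gamma(X, y, all_attrs, radius)
    subset = list(start)  # warm start from a previously found subset
    while gamma(X, y, subset, radius) < target:
        remaining = [a for a in all_attrs if a not in subset]
        best = max(remaining, key=lambda a: gamma(X, y, subset + [a], radius))
        subset.append(best)
    return subset


def multi_scale_reducts(X, y, radii):
    """Process radii in ascending order, seeding each search with the
    subset found at the previous (smaller) radius."""
    reducts, prev = {}, []
    for r in sorted(radii):
        prev = forward_reduct(X, y, r, start=prev)
        reducts[r] = prev
    return reducts
```

Calling `multi_scale_reducts(X, y, [0.1, 0.2, 0.3])` on a list of numeric feature rows `X` and a label list `y` returns one attribute subset per radius. Passing `start=()` at every radius instead would correspond to the baseline the abstract describes, where the heuristic is rerun independently at each scale.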

CLC number:

  • TP181