Computer Science ›› 2018, Vol. 45 ›› Issue (10): 43-46. doi: 10.11896/j.issn.1002-137X.2018.10.008

• 2018 China Conference on Granular Computing and Knowledge Discovery •

Feature Selection Algorithm Based on Segmentation Strategy

JIAO Na

  1. Department of Information Science and Technology, East China University of Political Science and Law, Shanghai 201620, China
  • Received: 2018-04-17 Online: 2018-11-05 Published: 2018-11-05
  • About the author: JIAO Na (born 1977), female, Ph.D., associate professor. Her main research interests include artificial intelligence, pattern recognition, and rough sets. E-mail: zdx.jn@163.com (corresponding author).
  • Funding:
    Supported by the National Social Science Foundation of China (06BFX051) and the Shanghai University Young Teacher Training Program (hdzf10008).

Abstract: Feature selection is one of the most fundamental and important research topics in rough set theory, which is an efficient tool for reducing redundancy. Most existing feature selection algorithms are effective on small data tables. In the information age, however, data tables contain ever more items and features, and traditional feature selection methods become very inefficient on large-scale tables. This paper therefore introduces the idea of segmentation: a large data table is divided into several smaller tables, and the selection results obtained on them are merged to solve the feature selection problem of the original table. Experimental results on benchmark data sets show that the proposed method is effective.

Key words: Core-table, Feature selection, Redundant-table, Rough set theory, Segmentation
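The divide-and-merge idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not say how the table is segmented or how the core-table and redundant-table are built, so the sketch assumes a column-wise split and uses a generic greedy selection driven by the rough-set positive region. All function names (`positive_region_size`, `greedy_select`, `segmented_select`) and the `block_size` parameter are hypothetical.

```python
from collections import defaultdict


def positive_region_size(rows, features, decision):
    """Count rows whose equivalence class under `features` has one decision value."""
    classes = defaultdict(list)
    for row in rows:
        classes[tuple(row[f] for f in features)].append(row[decision])
    return sum(len(d) for d in classes.values() if len(set(d)) == 1)


def greedy_select(rows, candidates, decision, start=()):
    """Grow `start` until it matches the positive region of start + candidates."""
    selected = list(start)
    pool = [f for f in candidates if f not in selected]
    target = positive_region_size(rows, selected + pool, decision)
    while positive_region_size(rows, selected, decision) < target:
        # Add the candidate that enlarges the positive region the most.
        best = max(pool, key=lambda f: positive_region_size(rows, selected + [f], decision))
        selected.append(best)
        pool.remove(best)
    return selected


def segmented_select(rows, features, decision, block_size):
    """Assumed column-wise segmentation: select per sub-table, then merge."""
    merged = []
    for i in range(0, len(features), block_size):
        block = features[i:i + block_size]  # one smaller sub-table
        merged.extend(greedy_select(rows, block, decision))
    # Merge step: prune the union of the per-block results, then top up from
    # the full feature list in case some features only matter jointly.
    pruned = greedy_select(rows, merged, decision)
    return greedy_select(rows, features, decision, start=pruned)


# Toy decision table: "y" is determined by "a"; "c" mirrors "a"; "b" and "d" are noise.
rows = [
    {"a": 0, "b": 0, "c": 1, "d": 0, "y": "no"},
    {"a": 0, "b": 1, "c": 1, "d": 1, "y": "no"},
    {"a": 1, "b": 0, "c": 0, "d": 0, "y": "yes"},
    {"a": 1, "b": 1, "c": 0, "d": 1, "y": "yes"},
]
selected = segmented_select(rows, ["a", "b", "c", "d"], "y", block_size=2)
```

The final top-up pass is a design choice of this sketch: a purely block-local selection can miss features that are informative only in combination across blocks, so the merged result is checked against the full feature set before it is returned.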

CLC Number: 

  • TP181