Computer Science ›› 2020, Vol. 47 ›› Issue (3): 92-97. doi: 10.11896/jsjkx.190500180

• Database & Big Data & Data Science •

Class-specific Distribution Preservation Reduction in Interval-valued Decision Systems

YANG Wen-jing, ZHANG Nan, TONG Xiang-rong, DU Zhen-bin

  1. (Key Lab for Data Science and Intelligence Technology of Shandong Higher Education Institutes, Yantai University, Yantai, Shandong 264005, China)
    (School of Computer and Control Engineering, Yantai University, Yantai, Shandong 264005, China)
  • Received: 2019-05-31 Online: 2020-03-15 Published: 2020-03-30
  • Corresponding author: ZHANG Nan (zhangnan0851@163.com)
  • About author: YANG Wen-jing, born in 1996, postgraduate. Her main research interests include rough set theory, data mining and machine learning. ZHANG Nan, born in 1979, Ph.D., lecturer, master's supervisor. His main research interests include rough set theory, cognitive informatics and artificial intelligence.
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61572418, 61572419, 61873117, 61403329) and the Shandong Provincial Natural Science Foundation (ZR2018BA004, ZR2016FM42).

Abstract: Attribute reduction is one of the central research topics in rough set theory. It removes redundant attributes and yields a minimal attribute subset that preserves a given classification ability of a decision system. A distribution reduct preserves the distribution of all decision classes in a decision system, but preserving the distribution of every decision class may be unnecessary in practice. To address this problem, this paper proposes the concept of class-specific distribution preservation reduction based on α-tolerance relations in interval-valued decision systems, proves the related theorems, and constructs the corresponding discernibility matrix. On this basis, it proposes a class-specific distribution preservation reduction algorithm based on discernibility matrices (CDRDM) and analyzes the relationship between the sets of non-empty elements in the discernibility matrices constructed by CDRDM and by the global distribution preservation reduction algorithm (DRDM). In the experiments, six UCI data sets were selected and an interval parameter was introduced. With the interval parameter set to 1.2 and the threshold set to 0.5, the reducts and average reduct lengths of DRDM and of CDRDM under three different decision classes were compared. With the interval parameter set to 1.2 and 1.6 and the threshold set to 0.4 and 0.5, respectively, the reduction times of DRDM and of CDRDM under two different decision classes were reported as the numbers of objects and attributes varied. The experimental results show that CDRDM may produce different reducts for different decision classes; that, when the decision system has more than one decision class, the average reduct length of CDRDM is less than or equal to that of DRDM; and that CDRDM improves reduction efficiency to varying degrees depending on the decision class.

Key words: Class-specific attribute reduction, Discernibility matrix, Distribution reduction, Interval-valued decision system, Rough set
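
To make the notion of a class-specific distribution reduct more concrete, the following Python sketch checks the definition directly on a toy interval-valued decision table. It is not the CDRDM algorithm from the paper: it uses brute-force subset search instead of a discernibility matrix. The similarity measure (interval overlap ratio), the function names `similarity`, `distribution` and `smallest_reducts`, and the toy data are all assumptions made for illustration, since this page does not give the paper's formal definitions.

```python
from fractions import Fraction
from itertools import combinations


def similarity(u, v):
    """Assumed similarity of two closed intervals: |u ∩ v| / |u ∪ v| (hull length).

    Endpoints are assumed to be integers (or Fractions) so results stay exact.
    """
    (a, b), (c, d) = u, v
    inter = max(0, min(b, d) - max(a, c))
    union = max(b, d) - min(a, c)
    return Fraction(1) if union == 0 else Fraction(inter, union)


def distribution(table, labels, attrs, alpha, target):
    """For each object, the share of its alpha-tolerance class that lies in `target`."""
    n = len(table)
    mu = []
    for i in range(n):
        # Objects similar to object i on every attribute in attrs (assumed relation).
        block = [j for j in range(n)
                 if all(similarity(table[i][a], table[j][a]) >= alpha for a in attrs)]
        mu.append(Fraction(sum(labels[j] == target for j in block), len(block)))
    return mu


def smallest_reducts(table, labels, alpha, targets):
    """Brute force: smallest attribute subsets preserving the distribution of `targets`."""
    attrs = range(len(table[0]))
    full = {t: distribution(table, labels, attrs, alpha, t) for t in targets}
    for size in range(1, len(attrs) + 1):
        found = [set(b) for b in combinations(attrs, size)
                 if all(distribution(table, labels, b, alpha, t) == full[t]
                        for t in targets)]
        if found:
            return found
    return []


# Hypothetical toy table: 4 objects, 2 interval-valued attributes, 3 decision classes.
table = [
    [(0, 2), (0, 2)],
    [(0, 3), (5, 7)],
    [(4, 7), (1, 2)],
    [(5, 7), (5, 7)],
]
labels = ["A", "B", "C", "C"]
alpha = Fraction(1, 2)

print(smallest_reducts(table, labels, alpha, ["C"]))            # class-specific, class C
print(smallest_reducts(table, labels, alpha, ["A"]))            # class-specific, class A
print(smallest_reducts(table, labels, alpha, ["A", "B", "C"]))  # all classes (global)
```

On this toy table the search returns `[{0}]` for class C but `[{0, 1}]` for class A and for all classes together, which illustrates the two observations in the abstract: reducts can differ between decision classes, and a class-specific reduct is never longer than a global one. The exhaustive search is exponential in the number of attributes and is meant only for small examples.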

CLC Number:

  • TP181