Computer Science ›› 2017, Vol. 44 ›› Issue (Z11): 129-132. doi: 10.11896/j.issn.1002-137X.2017.11A.026

• Intelligent Computing •

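Decision Tree Algorithm Based on Attribute Significance

WANG Rong, LIU Zun-ren, JI Jun

  1. School of Data Science and Software Engineering, Qingdao University, Qingdao 266071; College of Computer Science and Technology, Qingdao University, Qingdao 266071; College of Computer Science and Technology, Qingdao University, Qingdao 266071
  • Online: 2018-12-01   Published: 2018-12-01
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61503208).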

Abstract: The traditional ID3 decision tree algorithm suffers from difficult attribute selection, low classification efficiency, weak noise resistance, and poor scalability to large data sets. To address these problems, a decision tree algorithm based on attribute significance and the variable precision rough set model is proposed; it removes noisy data while keeping the decision tree from growing too large. The algorithm was validated on several UCI standard data sets. The experimental results show that it outperforms the ID3 algorithm in both the size of the resulting decision tree and classification accuracy.

Key words: Decision tree, Attribute significance, Variable precision rough set, Attribute reduction, Data mining

