Computer Science ›› 2010, Vol. 37 ›› Issue (4): 231-.
• Artificial Intelligence •
岳海亮,闫德勤
YUE Hai-liang,YAN De-qin
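Abstract (translated from Chinese): Discretization of continuous attributes is of great importance to the subsequent stages of machine learning and data mining. A new discretization algorithm for decision tables is proposed. The algorithm first uses information entropy as the criterion for selecting suitable cut points from the candidate cut-point set, and then deletes redundant cut points to optimize the discretization result. To keep the classification ability of the decision table unchanged as far as possible, the deletion process is controlled by the inconsistency rate. Finally, several experimental data sets are selected, the discretized data are classified with the currently popular support vector machine (SVM) classifier, and the results are compared with those of other discretization algorithms. The results show that the proposed algorithm is effective.
Key words (translated from Chinese): Discretization of continuous attributes, Decision table, Information entropy, Inconsistency rate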
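The entropy-based selection step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact criterion, so the code below assumes a common formulation: candidate cut points are midpoints between consecutive distinct attribute values, and the cut with the lowest weighted class entropy is chosen. The function names `entropy`, `candidate_cuts`, and `best_cut` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of decision labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    # p * log2(1/p) summed over the classes present in `labels`.
    return sum((c / n) * math.log2(n / c) for c in Counter(labels).values())


def candidate_cuts(values):
    """Assumed candidate set: midpoints between consecutive distinct values."""
    vs = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(vs, vs[1:])]


def best_cut(values, labels):
    """Return (cut, weighted_entropy) for the lowest-entropy candidate cut."""
    n = len(values)
    best, best_h = None, float("inf")
    for cut in candidate_cuts(values):
        left = [d for v, d in zip(values, labels) if v <= cut]
        right = [d for v, d in zip(values, labels) if v > cut]
        h = len(left) / n * entropy(left) + len(right) / n * entropy(right)
        if h < best_h:
            best, best_h = cut, h
    return best, best_h
```

For a single attribute with values `[1.0, 2.0, 3.0, 4.0]` and decisions `["a", "a", "b", "b"]`, the cut at 2.5 separates the two classes perfectly, so it is the one this sketch selects.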
Abstract: Discretizing continuous attributes contributes substantially to subsequent machine learning and data mining. A new information-entropy-based algorithm for discretizing decision tables is proposed. Starting from a preliminary discretization scheme, redundant cut points are deleted, with inconsistency checking of the decision table controlling the deletion. Classification experiments on the discretized data were performed using SVM, and a comparison with other discretization algorithms shows that the proposed algorithm is effective.
Key words: Discretization, Decision table, Information entropy, Inconsistency
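The inconsistency-controlled deletion of redundant cut points can be sketched in a similar way. The abstract does not define the inconsistency rate, so this sketch assumes one common definition: among objects that share the same discretized condition values, every object outside the majority decision class counts as inconsistent, and the rate is that count divided by the number of objects. The greedy pruning order, the `threshold` parameter, and all function names are assumptions made for illustration only.

```python
from bisect import bisect_left
from collections import Counter, defaultdict


def discretize(row, cuts):
    """Map each continuous value to the index of its interval.

    `cuts[i]` is the sorted list of cut points for attribute i;
    a value equal to a cut point falls into the lower interval.
    """
    return tuple(bisect_left(c, v) for v, c in zip(row, cuts))


def inconsistency_rate(rows, labels, cuts):
    """Fraction of objects outside the majority class of their discretized group."""
    groups = defaultdict(Counter)
    for row, d in zip(rows, labels):
        groups[discretize(row, cuts)][d] += 1
    clashes = sum(sum(g.values()) - max(g.values()) for g in groups.values())
    return clashes / len(rows)


def prune_cuts(rows, labels, cuts, threshold=0.0):
    """Greedily drop each cut whose removal keeps the rate <= threshold."""
    cuts = [sorted(c) for c in cuts]
    for a in range(len(cuts)):
        for cut in list(cuts[a]):
            # Trial scheme: same cuts, minus `cut` on attribute `a`.
            trial = [
                [x for x in c if x != cut] if i == a else c
                for i, c in enumerate(cuts)
            ]
            if inconsistency_rate(rows, labels, trial) <= threshold:
                cuts = trial
    return cuts
```

With `threshold=0.0` a cut is removed only when the discretized table stays fully consistent, which matches the stated goal of keeping the classification ability of the decision table unchanged as far as possible. A larger threshold would trade some consistency for fewer intervals.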
岳海亮,闫德勤. An Information-Theory-Based Algorithm for Discretizing Continuous Attributes in Decision Tables[J]. Computer Science, 2010, 37(4): 231-. https://doi.org/
YUE Hai-liang,YAN De-qin. New Algorithm for Discretization Based on Information Entropy[J]. Computer Science, 2010, 37(4): 231-. https://doi.org/
Link to this article: https://www.jsjkx.com/CN/
https://www.jsjkx.com/CN/Y2010/V37/I4/231