Computer Science ›› 2016, Vol. 43 ›› Issue (4): 264-269. doi: 10.11896/j.issn.1002-137X.2016.04.054
梁路,黎剑,霍颖翔,滕少华
LIANG Lu, LI Jian, HUO Ying-xiang and TENG Shao-hua
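Abstract: Traditional data normalization usually relies on linear transformations. When applied to a non-uniformly distributed data set, the spacing between data points within some local intervals can become too small, which makes the results of subsequent data mining (especially distance-based mining) less accurate. This paper therefore proposes a nonlinear-transformation normalization method based on data fitting for non-uniformly distributed data. Without altering the overall distribution pattern of the data, the method derives a corresponding nonlinear transformation function from statistics on the data and uses it to rescale the value of each data point nonlinearly, expanding densely populated intervals while compressing sparse ones, so that mining results become more accurate. In the experiments, three classic classification algorithms, BP (Back Propagation) neural networks, Support Vector Machines (SVM), and K-Nearest Neighbor (KNN) classification, were applied to several data sets. The results show that classification error rates dropped to varying degrees and that the F1 measure improved.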
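The idea of stretching dense intervals and compressing sparse ones can be illustrated with a minimal sketch. This is not the authors' fitting procedure, which the abstract does not specify. As a stand-in, it fits a piecewise-linear cumulative-distribution curve to a histogram of one feature and uses that curve as the nonlinear transformation. The function names `fit_cdf_transform` and `min_max`, and the `bins` and `smoothing` parameters, are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right


def min_max(values):
    """Ordinary linear (min-max) normalization onto [0, 1], for comparison."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def fit_cdf_transform(values, bins=10, smoothing=0.1):
    """Fit a monotone, piecewise-linear map from [min, max] onto [0, 1].

    Each histogram bin receives a share of the output range proportional
    to the number of points in it, so dense bins are stretched and sparse
    bins are compressed. `smoothing` (an assumption of this sketch) mixes
    in a small uniform share so that empty bins do not collapse to zero
    width and the map stays strictly increasing.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("values must not all be equal")
    width = (hi - lo) / bins

    # Count the points falling into each equal-width bin.
    counts = [0] * bins
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1

    edges = [lo + i * width for i in range(bins + 1)]
    edges[-1] = hi

    # Cumulative output position at each bin edge.
    total = len(values)
    cum = [0.0]
    for c in counts:
        share = (1.0 - smoothing) * c / total + smoothing / bins
        cum.append(cum[-1] + share)
    cum[-1] = 1.0

    def transform(x):
        x = min(max(x, lo), hi)  # clamp unseen values into the fitted range
        i = min(bisect_right(edges, x) - 1, bins - 1)
        frac = (x - edges[i]) / (edges[i + 1] - edges[i])
        return cum[i] + frac * (cum[i + 1] - cum[i])

    return transform


# Fifty points crowded into [0, 0.49] plus two far-away values.
data = [i / 100 for i in range(50)] + [5.0, 10.0]
transform = fit_cdf_transform(data)
linear = min_max(data)
nonlinear = [transform(v) for v in data]
print("linear spread of dense cluster:   ", linear[49] - linear[0])
print("nonlinear spread of dense cluster:", nonlinear[49] - nonlinear[0])
```

Because the fitted map is strictly increasing, the ordering of the data points is preserved; only the spacing between them changes, which is what matters for distance-based classifiers such as KNN.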