Computer Science ›› 2016, Vol. 43 ›› Issue (4): 264-269. doi: 10.11896/j.issn.1002-137X.2016.04.054

Nonlinear Normalization for Non-uniformly Distributed Data

LIANG Lu, LI Jian, HUO Ying-xiang and TENG Shao-hua   

  • Online:2018-12-01 Published:2018-12-01

Abstract: Continuous attributes are traditionally normalized with a linear transformation. On some non-uniformly distributed datasets, linear normalization leaves the intervals between data points in a local region too small, which can make the results of subsequent data mining, particularly distance-based methods, inaccurate. This paper proposes a nonlinear normalization based on data fitting, which finds a corresponding nonlinear transformation function without changing the distribution rules of the data. The function rescales the data intervals nonlinearly, expanding the intervals of dense data while shrinking the intervals of sparse data, which makes data mining more accurate. The method was tested with a neural network, SVM, and KNN on different datasets. The results show that the error rate decreases while the F1 measure increases.

Key words: Non-uniform distribution, Nonlinear normalization, Data preprocessing
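
The idea of expanding dense intervals and shrinking sparse ones can be illustrated with a minimal sketch. The abstract does not specify the fitting procedure, so the code below is not the authors' method: it assumes a stand-in technique, a piecewise-linear fit of the empirical cumulative distribution, and the function names `linear_normalize` and `fit_nonlinear_normalizer` are hypothetical. It requires NumPy and at least two distinct values in the data.

```python
import numpy as np


def linear_normalize(x):
    """Conventional min-max normalization onto [0, 1]."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())


def fit_nonlinear_normalizer(x):
    """Fit a monotone map onto [0, 1] from the empirical CDF of x.

    Assumption: this piecewise-linear CDF fit is only a stand-in for
    the data-fitting step described in the abstract.
    """
    values, counts = np.unique(np.asarray(x, dtype=float), return_counts=True)
    cdf = np.cumsum(counts) / counts.sum()
    # Rescale so the smallest value maps to 0 and the largest to 1.
    targets = (cdf - cdf[0]) / (cdf[-1] - cdf[0])
    # Interpolate linearly between the fitted points; order is preserved.
    return lambda q: np.interp(q, values, targets)


# Four tightly clustered points and one distant point.
data = np.array([0.0, 0.01, 0.02, 0.03, 10.0])

lin = linear_normalize(data)
nonlin = fit_nonlinear_normalizer(data)(data)

print("linear:   ", lin)
print("nonlinear:", nonlin)
```

Under linear normalization the four clustered points are squeezed into a tiny part of the unit interval, whereas the fitted map spreads them out and compresses the large gap before the distant point. Because the map is monotone, the ordering of the data is unchanged.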

