Classification Algorithm Using Linear Regression and Attribute Ensemble

doi:10.11896/j.issn.1002-137X.2017.06.035

Abstract

For classification problems involving high-dimensional, small-sample data, the predictive accuracy of a classification model is limited by the complexity of the high-dimensional attributes. To further improve accuracy, a classification algorithm using linear regression and attribute ensemble (LRAE) is proposed. Linear regression is used to construct an attribute linear classifier (ALC) for each attribute. To avoid the loss of accuracy caused by too many ALCs, the empirical loss from the empirical risk minimization strategy is used as the criterion for selecting ALCs. Majority voting is then adopted to integrate the selected ALCs. Experiments on gene expression data show that LRAE achieves higher accuracy than logistic regression, support vector machine, and random forest algorithms.

Key words: Linear regression, Single attribute classification, Empirical loss, Attribute ensemble, Majority voting method
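
The pipeline described in the abstract (one linear-regression classifier per attribute, selection by empirical loss, majority voting) can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything specific here is an assumption made for illustration: labels are binary and coded as -1/+1, each ALC is an ordinary least-squares fit of the label on a single attribute and classifies by the sign of the fitted value, the empirical loss is the training misclassification rate, and the `k` ALCs with the lowest loss are kept. The function names and the parameter `k` are hypothetical and do not come from the paper.

```python
import numpy as np

def fit_alc(x, y):
    """Fit y ~ w*x + b by least squares on a single attribute (assumed ALC form)."""
    A = np.column_stack([x, np.ones_like(x)])
    (w, b), *_ = np.linalg.lstsq(A, y, rcond=None)
    return w, b

def predict_alc(w, b, x):
    """Classify by the sign of the regression output; labels are assumed to be -1/+1."""
    return np.where(w * x + b >= 0, 1, -1)

def fit_lrae(X, y, k=5):
    """Build one ALC per attribute and keep the k with the lowest empirical loss.

    Assumption: empirical loss is the 0-1 loss averaged over the training set.
    An odd k avoids tied votes.
    """
    alcs = []
    for j in range(X.shape[1]):
        w, b = fit_alc(X[:, j], y)
        loss = np.mean(predict_alc(w, b, X[:, j]) != y)
        alcs.append((loss, j, w, b))
    alcs.sort(key=lambda t: t[0])  # lowest training loss first
    return alcs[:k]

def predict_lrae(model, X):
    """Majority vote over the selected ALCs (sum of -1/+1 votes, then sign)."""
    votes = sum(predict_alc(w, b, X[:, j]) for _, j, w, b in model)
    return np.where(votes >= 0, 1, -1)
```

A call such as `model = fit_lrae(X_train, y_train, k=5)` followed by `predict_lrae(model, X_test)` would apply the sketch to a samples-by-attributes matrix. Selecting on training loss alone is the simplest reading of the abstract; the paper's actual selection rule and threshold may differ.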

QIANG Bao-hua, TANG Bo, WANG Yu-feng, ZOU Xian-chun, LIU Zheng-li, SUN Zhong-xu and XIE Wu. Classification Algorithm Using Linear Regression and Attribute Ensemble[J]. Computer Science, 2017, 44(6): 212-215.

