Computer Science, 2019, Vol. 46, Issue (2): 196-201. doi: 10.11896/j.issn.1002-137X.2019.02.030

• Artificial Intelligence •

Randomization of Classes Based Random Forest Algorithm

GUAN Xiao-qiang, PANG Ji-fang, LIANG Ji-ye   

  1. School of Computer and Information Technology,Shanxi University,Taiyuan 030006,China
    Key Laboratory of Computational Intelligence and Chinese Information Processing (Shanxi University),Ministry of Education,Taiyuan 030006,China
  • Received:2018-09-07 Online:2019-02-25 Published:2019-02-25

Abstract: Random forest is a widely used classification method in data mining and machine learning. It has attracted sustained research attention at home and abroad and has been applied to a wide range of practical problems. Traditional random forest methods do not consider how the number of classes affects classification performance, and they neglect the correlation between base classifiers and classes, which limits their performance on multi-class classification problems. To address this, and drawing on the characteristics of multi-class classification, this paper proposes a randomization of classes based random forest algorithm (RCRF). From the perspective of classes, a randomization of classes is added to the two traditional randomizations of random forest, and base classifiers with different emphases are designed for different classes. Because different base classifiers focus on different classes, the decision trees they generate differ in structure, which preserves the performance of each individual base classifier while further increasing the diversity among base classifiers. To verify its effectiveness, RCRF is compared with other algorithms on 21 data sets from the UCI repository. The experiments cover two aspects. First, accuracy, F1-measure and the Kappa coefficient are used to evaluate the performance of RCRF. Second, the κ-error diagram is used to compare and analyze the algorithms from the perspective of diversity. Experimental results show that the proposed algorithm effectively improves the overall performance of the ensemble model and has clear advantages on multi-class classification problems.
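
The abstract names the κ-error diagram as its diversity analysis but does not spell out how it is built. As a rough illustration, the sketch below follows the usual construction: each pair of base classifiers contributes one point, with Cohen's kappa between the two classifiers' predictions on one axis and the pair's average error on the other. It is not the authors' code; the function names (`cohen_kappa`, `error_rate`, `kappa_error_points`) and the toy labels are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa between two equally long label sequences."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: sum over labels of the product of marginal counts.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both sequences consist of one identical label
    return (observed - expected) / (1.0 - expected)


def error_rate(pred, truth):
    """Fraction of predictions that differ from the true labels."""
    return sum(p != t for p, t in zip(pred, truth)) / len(truth)


def kappa_error_points(predictions, truth):
    """Return one (kappa, mean error) point per pair of base classifiers."""
    points = []
    for p, q in combinations(predictions, 2):
        kappa = cohen_kappa(p, q)
        mean_error = (error_rate(p, truth) + error_rate(q, truth)) / 2
        points.append((kappa, mean_error))
    return points


if __name__ == "__main__":
    # Hypothetical three-class labels and per-tree predictions.
    truth = [0, 1, 2, 0, 1, 2]
    tree_predictions = [
        [0, 1, 2, 0, 1, 2],
        [0, 1, 2, 0, 1, 0],
        [0, 2, 2, 0, 1, 2],
    ]
    for kappa, err in kappa_error_points(tree_predictions, truth):
        print(f"kappa={kappa:.3f}  mean_error={err:.3f}")
```

In such a plot, points further toward low kappa indicate pairs that agree less (more diverse), and points lower on the error axis indicate more accurate pairs, so an ensemble whose cloud sits toward the low-kappa, low-error corner is preferable on both counts.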

Key words: Diversity, Multi-class classification problems, Random forest, Randomization of classes

CLC Number: TP181