Computer Science ›› 2014, Vol. 41 ›› Issue (2): 111-113.


Correlated Rules Based Associative Classification for Imbalanced Datasets

HUANG Zai-xiang, ZHOU Zhong-mei and HE Tian-zhong

  • Online: 2018-11-14 Published: 2018-11-14

Abstract: Many studies have shown that associative classification is a promising classification method. However, most associative classification algorithms may not achieve high classification performance on imbalanced datasets because they generate rules within the "support-confidence" framework. In imbalanced datasets, confidence (and support) tend to be biased toward the majority class, so instances of the minority class may be misclassified. We propose a new associative classification approach called CRAC (Correlated Rules based Associative Classification for Imbalanced Datasets). First, we mine frequent and mutually associated itemsets for classification, which yields a small set of high-quality rules. Second, among all rules that share the same frequent and associated itemset as their condition, CRAC selects only the rule with the largest lift as a class association rule (CAR). As a result, the antecedent and the consequent of each rule CRAC generates are positively correlated. Finally, we rank rules according to a new metric that integrates lift, support and Complement Class Support (CCS), so positively correlated rules are likely to be used to predict the minority class. Our experiments on fifteen UCI datasets show that our approach is an effective classification technique for both balanced and imbalanced datasets, and that it achieves better average classification accuracy than CBA.

Key words: Data mining, Associative classification, Imbalanced datasets, Correlated rules
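
The abstract's point that confidence favors the majority class while lift does not can be illustrated with a small sketch. The code below is not the authors' implementation: the function names, the toy data and the CCS formula (support of the itemset within the complement class, a commonly used definition) are assumptions made for illustration. The combined ranking metric is not specified in the abstract and is therefore not reproduced here.

```python
def rule_measures(transactions, labels, itemset, target):
    """Measures for the rule `itemset -> target` (hypothetical helper).

    CCS is computed here as sup(itemset AND not target) / sup(not target),
    which is an assumed definition, not one taken from the abstract.
    """
    n = len(transactions)
    itemset = frozenset(itemset)
    covered = [itemset <= set(t) for t in transactions]
    n_cond = sum(covered)                       # transactions containing the itemset
    n_class = sum(1 for y in labels if y == target)
    n_both = sum(1 for c, y in zip(covered, labels) if c and y == target)
    n_other = n - n_class                       # size of the complement class

    support = n_both / n
    confidence = n_both / n_cond if n_cond else 0.0
    lift = confidence / (n_class / n) if n_class else 0.0
    ccs = (n_cond - n_both) / n_other if n_other else 0.0
    return {"support": support, "confidence": confidence, "lift": lift, "ccs": ccs}


def best_class_by_lift(transactions, labels, itemset):
    """Pick the consequent with the largest lift for one condition itemset."""
    classes = sorted(set(labels))
    return max(
        classes,
        key=lambda c: rule_measures(transactions, labels, itemset, c)["lift"],
    )


# Toy imbalanced data: 2 minority ("pos") and 8 majority ("neg") instances.
transactions = [
    {"a", "b"}, {"a", "c"}, {"a", "b"}, {"a", "d"}, {"b"},
    {"c"}, {"b", "c"}, {"d"}, {"c", "d"}, {"b", "d"},
]
labels = ["pos", "pos", "neg", "neg", "neg", "neg", "neg", "neg", "neg", "neg"]

for cls in ("pos", "neg"):
    print(cls, rule_measures(transactions, labels, {"a"}, cls))
print("selected consequent:", best_class_by_lift(transactions, labels, {"a"}))
```

In this toy data the rules {a} -> pos and {a} -> neg have the same confidence, so confidence alone cannot separate them, whereas lift is above 1 only for the minority class.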

