Computer Science ›› 2019, Vol. 46 ›› Issue (11A): 194-198.

• Data Science •

Multilayer Perceptron Classification Algorithm Based on Spectral Clustering and Simultaneous Two Sample Representation

LIU Shu-dong, WEI Jia-min   

  1. (School of Information and Security Engineering,Zhongnan University of Economics and Law,Wuhan 430073,China)
  • Online:2019-11-10 Published:2019-11-20

Abstract: Classification learning from imbalanced datasets has long been a hot topic in data mining and machine learning. Data-level, algorithm-level and ensemble solutions are the three main approaches to imbalanced learning. Undersampling, a data-level solution, is widely used in many imbalanced learning scenarios; its drawback is that it discards potentially useful majority class instances. In this paper, spectral clustering is introduced to sample the majority class instances in order to build a simultaneous two sample representation. First, all majority class instances are divided into clusters by spectral clustering, and a different number of representative samples is extracted from each cluster according to the size of the cluster and the average distance between the minority class instances and the cluster. Then, the simultaneous two sample representation is generated from the instances extracted from the clusters and the minority class instances. The proposed method not only alleviates the data explosion problem of simultaneous two sample representation, but also avoids the loss of useful information caused by random sampling. Finally, experiments on nine groups of datasets from UCI verify its validity.
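
The sampling and pairing steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the spectral clustering step has already been run and its cluster labels are supplied as input. The allocation rule (weight = cluster size divided by mean distance to the minority class) is a hypothetical choice, since the abstract names the two factors but not the formula. The pair label `(0, 1)` and all function names are likewise assumptions made for illustration.

```python
import math
import random
from collections import defaultdict


def mean_distance(cluster, minority):
    """Average Euclidean distance over all (cluster point, minority point) pairs."""
    total = sum(math.dist(a, b) for a in cluster for b in minority)
    return total / (len(cluster) * len(minority))


def allocate_quota(clusters, minority, budget):
    """Split `budget` across clusters.

    ASSUMPTION: weight = size / mean distance, so large clusters and clusters
    close to the minority class receive more representatives.
    """
    weights = {
        cid: len(pts) / (mean_distance(pts, minority) + 1e-12)
        for cid, pts in clusters.items()
    }
    total = sum(weights.values())
    # Clamp to [1, cluster size]; after rounding the quotas may not sum
    # exactly to `budget`.
    return {
        cid: max(1, min(len(clusters[cid]), round(budget * w / total)))
        for cid, w in weights.items()
    }


def undersample(majority, labels, minority, budget, seed=0):
    """Draw representatives at random from each cluster according to its quota."""
    clusters = defaultdict(list)
    for point, cid in zip(majority, labels):
        clusters[cid].append(point)
    quota = allocate_quota(clusters, minority, budget)
    rng = random.Random(seed)
    picked = []
    for cid, pts in clusters.items():
        picked.extend(rng.sample(pts, quota[cid]))
    return picked


def pair_representation(majority_reps, minority):
    """Concatenate each (majority, minority) pair into one input vector.

    ASSUMPTION: the target is the label pair (0, 1), i.e. majority then minority.
    """
    return [(maj + mino, (0, 1)) for maj in majority_reps for mino in minority]


# Toy data: points are tuples; `labels` stands in for spectral clustering output.
majority = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (9.0, 9.0), (9.0, 10.0)]
labels = [0, 0, 0, 0, 1, 1]
minority = [(5.0, 5.0), (6.0, 5.0)]

reps = undersample(majority, labels, minority, budget=3)
pairs = pair_representation(reps, minority)
```

Pairing only the extracted representatives with the minority instances keeps the number of paired inputs at `len(reps) * len(minority)` rather than `len(majority) * len(minority)`, which is the data explosion the abstract refers to.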

Key words: Classification, Imbalanced learning, Multilayer perceptron, Spectral clustering, Under-sampling

CLC Number: TP311