Computer Science ›› 2019, Vol. 46 ›› Issue (6A): 439-441.

• Big Data & Data Mining •

Research on Naive Bayes Ensemble Method Based on Kmeans++ Clustering

ZHONG Xi, SUN Xiang-e   

  1. National Electrical and Electronic Demonstration Center for Experimental Education, Yangtze University, Jingzhou, Hubei 434000, China
  • Online: 2019-06-14  Published: 2019-07-02

Abstract: Naive Bayes is widely applied because of its simplicity, high computational efficiency, high accuracy and solid theoretical foundation. Since difference among base classifiers is a key condition for ensemble learning, this paper studies a method for increasing the difference within a naive Bayes ensemble based on the Kmeans++ clustering technique, so as to improve the generalization performance of naive Bayes. Firstly, multiple naive Bayes classifier models are trained on a training sample set. To increase the difference between the base classifiers, the Kmeans++ algorithm is then used to cluster the prediction results of the base classifiers on the validation set. Finally, the base classifier with the best generalization performance is selected from each cluster for ensemble learning, and the final result is obtained by simple voting. UCI standard data sets are used to verify the algorithm at the end of this paper, and the results show that its generalization performance is greatly improved.
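The procedure summarized in the abstract (train a pool of naive Bayes classifiers, cluster their validation-set predictions with Kmeans++, pick the best classifier from each cluster, and combine the picks by simple voting) can be illustrated with a minimal sketch. This is an assumption-laden illustration using scikit-learn, not the authors' implementation: the data set, pool size, number of clusters and all variable names below are stand-ins.

```python
# Minimal sketch of a Kmeans++-clustering-based naive Bayes ensemble.
# Assumptions: scikit-learn available; breast_cancer used as a stand-in UCI data set;
# pool size, cluster count and selection rule are illustrative choices.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.cluster import KMeans
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

# 1. Train a pool of naive Bayes base classifiers on bootstrap samples of the training set.
n_base, n_clusters = 20, 5
pool = []
for _ in range(n_base):
    idx = rng.integers(0, len(X_train), len(X_train))
    pool.append(GaussianNB().fit(X_train[idx], y_train[idx]))

# 2. Cluster the base classifiers by their prediction vectors on the validation set,
#    using k-means++ initialization, so each cluster groups classifiers that behave alike.
val_preds = np.array([clf.predict(X_val) for clf in pool])   # shape (n_base, n_val)
labels = KMeans(n_clusters=n_clusters, init="k-means++", n_init=10,
                random_state=0).fit_predict(val_preds)

# 3. From each cluster, keep only the classifier with the best validation accuracy.
selected = []
for c in range(n_clusters):
    members = [i for i in range(n_base) if labels[i] == c]
    if members:
        best = max(members, key=lambda i: accuracy_score(y_val, val_preds[i]))
        selected.append(pool[best])

# 4. Combine the selected classifiers by simple (majority) voting on the test set.
votes = np.array([clf.predict(X_test) for clf in selected])
final = np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, votes)
print("ensemble accuracy:", accuracy_score(y_test, final))
```

Selecting one representative per cluster keeps the voted ensemble small while preserving the between-cluster difference that the clustering step exposes.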

Key words: Difference, Ensemble learning, Kmeans++ clustering, Naive Bayes

CLC Number: TP391