Computer Science, 2011, Vol. 38, Issue (11): 26-29.

• Computer Network and Information Security •

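Research of P2P Traffic Identification Based on Decision Tree Ensemble

LIU San-min, SUN Zhi-xin, LIU Yu-xia

  • Affiliations: (College of Information Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016) (Institute of Computer Technology, Nanjing University of Posts and Telecommunications, Nanjing 210003) (State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210093) (School of Computer and Information, Anhui Polytechnic University, Wuhu 241000) (School of Electrical Engineering, Anhui Polytechnic University, Wuhu 241000)
  • Online: 2018-12-01  Published: 2018-12-01
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (609937140), the Natural Science Foundation of Jiangsu Province (BK20099425), the Natural Science Basic Research Project of Jiangsu Higher Education Institutions (08KJB520005), the Jiangsu Province Six Talent Peaks Project, and the Youth Foundation of Anhui Polytechnic University (2008YQ041).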

Abstract: To improve the stability of the classification model, a traffic identification scheme based on an ensemble of decision tree classifiers is proposed. The model first uses the fast correlation-based filter (FCBF) feature selection method to extract the optimal set of classification features. It then forms five sub-classifiers by Bagging random sampling and combines them by majority vote to produce the decision model. Tests under two experimental schemes on a public data set show that the proposed scheme achieves better stability, overall classification accuracy, and P2P traffic identification accuracy than naive Bayes and naive Bayes with kernel density estimation. An explanation of this result is also given.

Key words: Traffic identification, Ensemble learning, Decision tree, Bayes classification, Stability
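
The ensemble step described in the abstract (five sub-classifiers formed by Bagging and combined by majority vote) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: FCBF feature selection is omitted, a one-level decision stump stands in for a full decision tree, and the function names, toy feature values, and class labels are all hypothetical.

```python
import random
from collections import Counter


def train_stump(rows, labels):
    """Fit a one-level decision tree (a stump, used here as a stand-in
    for a full decision tree): choose the feature/threshold split with
    the fewest training errors."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            # Each side predicts its majority label; an empty right side
            # falls back to the left side's label.
            left_label = Counter(left).most_common(1)[0][0]
            right_label = (Counter(right).most_common(1)[0][0]
                           if right else left_label)
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    _, f, t, left_label, right_label = best
    return lambda row: left_label if row[f] <= t else right_label


def train_bagging(rows, labels, n_estimators=5, seed=0):
    """Train n_estimators sub-classifiers, each on a bootstrap sample
    (drawn with replacement, same size as the training set)."""
    rng = random.Random(seed)
    n = len(rows)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(train_stump([rows[i] for i in idx],
                                  [labels[i] for i in idx]))
    return models


def predict(models, row):
    """Combine the sub-classifiers by majority vote."""
    votes = Counter(m(row) for m in models)
    return votes.most_common(1)[0][0]


# Hypothetical toy flows: two made-up numeric features per flow.
rows = [(1.0, 5.0), (2.0, 6.0), (3.0, 5.5), (4.0, 6.5),
        (10.0, 1.0), (11.0, 2.0), (12.0, 1.5), (13.0, 2.5)]
labels = ["WEB"] * 4 + ["P2P"] * 4

models = train_bagging(rows, labels, n_estimators=5)
print(predict(models, (0.5, 6.0)), predict(models, (20.0, 1.0)))
```

Using an odd number of sub-classifiers, as the abstract does with five, avoids tied votes in a two-class setting. Averaging over bootstrap samples is the usual argument for why Bagging reduces the variance of an unstable base learner such as a decision tree.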
