Computer Science (计算机科学) ›› 2014, Vol. 41 ›› Issue (Z11): 301-306.

• Data Mining •

基于子空间聚类算法的流量分类方法研究

许学研,王苏南,吴春明   

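  1. College of Computer Science and Technology, Zhejiang University, Hangzhou 310027; College of Computer Science and Technology, Zhejiang University, Hangzhou 310027; School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen 518005; College of Computer Science and Technology, Zhejiang University, Hangzhou 310027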
  • Publication date: 2018-11-14 Release date: 2018-11-14
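  • Funding:
    This work was supported by the National Basic Research Program of China (973 Program) (2012CB315903), the National Natural Science Foundation of China (61103200, 61379118), and the Zhejiang Provincial Key Science and Technology Innovation Team program (2011R50010).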

Network Traffic Classification Method Research Based on Subspace Clustering Algorithm

XU Xue-yan, WANG Su-nan and WU Chun-ming

  • Online: 2018-11-14 Published: 2018-11-14

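Abstract (from the Chinese): Network traffic today has two defining characteristics: service types keep changing, and service features are continually updated. Existing traffic classifiers, however, suffer from costly feature-library updates and high misclassification rates, so they cannot meet normal service classification needs. A subspace clustering algorithm is therefore needed to achieve fine-grained service classification while preserving classification precision, recall, and efficiency. Experiments show that the subspace clustering algorithm classifies services at a fine granularity, achieves an average classification precision above 95%, and requires little training data. Methods of this kind are also of great significance for improving the ability of DPI classifiers to adapt to the network environment.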

Keywords (from the Chinese): Deep packet inspection, Machine learning, Traffic classification, Subspace clustering

Abstract: Currently, the service types and features of network traffic are changing constantly, but existing classification methods cannot cope with such an environment because they cannot update their feature libraries efficiently and have a high misjudgment rate. A subspace clustering algorithm was therefore designed and its classification properties were tested. Experiments show that it can classify many service types, that its classification precision exceeds 95%, and that it needs few training samples. It is recommended as a way to help DPI classifiers adapt to a changing network environment.

Key words: Deep packet inspection, Machine learning, Traffic classification, Subspace clustering

