计算机科学 ›› 2021, Vol. 48 ›› Issue (11A): 585-587.doi: 10.11896/jsjkx.210100205

• 信息安全 • 上一篇    下一篇

基于AVX指令集的Sketch算法优化研究

谭玲玲, 杨飞, 易军凯   

  1. 北京信息科技大学自动化学院 北京100192
  • 出版日期:2021-11-10 发布日期:2021-11-12
  • 通讯作者: 谭玲玲(tanlingling@bistu.edu.cn)
  • 基金资助:
    NSFC-通用技术基础研究国家自然科学基金联合基金重点基金项目(U1636208)

Optimization Study of Sketch Algorithm Based on AVX Instruction Set

TAN Ling-ling, YANG Fei, YI Jun-kai   

  1. Institute of Automation,Beijing Information Science and Technology University,Beijing 100192,China
  • Online:2021-11-10 Published:2021-11-12
  • About author:TAN Ling-ling,born in 1986,Ph.D,lecturer.Her main research interests include active and passive filter design,electromagnetic compatibility simulation,and cyberspace security.
  • Supported by:
    NSFC-General Technology Basic Research National Natural Science Foundation of China(U1636208).

摘要: 网络流量检测是网络测量中最基础也是最重要的一环。基于Sketch的侦测方法可对网络数据流进行计数,此计数在网络流量的异常检测中具有区分大象流,并对异常流量进行检测和定位的作用。Sketch的实现过程中因用到多个哈希函数而对内存资源的要求较高,针对 Sketch的实现中应用多个哈希函数的性能瓶颈问题,提出了一种基于效率高、成熟度高的AVX指令集来提高Sketch性能的方法,研究了对CPU指令的消耗和算法计算效率的影响。首先,将数据流的元素用向量的形式来描述和存储,运用AVX指令实现向量的构造和运算。其次,将多次哈希运算简化成一次向量运行,降低Sketch对CPU计算资源的消耗,提升了多次哈希函数的综合性能,使得提升Sketch优化性能成为可能。最后,运用AVX指令集对Count-Min Sketch算法程序中的关键函数进行优化,并对优化后的代码进行测试分析。实验结果表明:在哈希函数的运算方面,AVX优化版本消耗的时间为原始版本的25%;当数据长度较短时,多个哈希函数消耗的指令在整个Sketch中占比较少,AVX优化版本消耗的时间约为原始版本的70%;随着数据长度逐渐增大,多个哈希函数消耗的指令在整个Sketch中占比也逐渐增大,AVX优化版本消耗的时间降为原始版本的40%。实验的仿真结果验证了该算法在提高网络数据流测量效率方面的有效性。

关键词: AVX指令集, CPU加速, Sketch, 网络测量

Abstract: Network traffic detection is the most basic and important part of network measurement.The detection method based on Sketch can count data flow of network,which can distinguish elephant flow,detect and locate abnormal traffic in abnormal detection of network traffic.Higher requirement of the memory resources is needed for multiple hash functions used in the implementation of Sketch.To address the restriction problems raised by multiple hash functions applied in the realization of Sketch,a me-thod of improving the performance of Sketch algorithm using AVX instruction set is put forward,and also the effects on CPU instruction consumption and algorithm computing efficiency are studied,which had high efficiency and high maturity.Firstly,the elements of data stream are described and stored in the form of a vector,utilizing AVX instructions to realize structures and opera-tions of vectors.Secondly,multiple hash operations are transformed into a vector operation,which reduces the CPU resources consumption of the Sketch algorithm,improves the comprehensive performance of multiple hash functions,and made it possible to improve the optimized performance of Sketch algorithm.Finally,AVX instruction set is used to optimize the key functions in the Count-Min Sketch algorithm program,and the optimized code is tested and analyzed.The experimental results show that the AVX optimized version takes 25% time of the original version in the operation of hash function.When the data length is short,the instructions consumed by multiple hash functions are relatively small in the whole Sketch,and the AVX optimized version consumes about 70% time of the original version.As the length of the data increases,instructions consumed by multiple hash functions take up a larger proportion in the whole Sketch algorithm,and the time consumed by the AVX optimized version can be reduced to 40% of the original version.The simulation results demonstrate the effectiveness of the proposed algorithm in improving the measurement efficiency of network data flow.

Key words: AVX instruction set, CPU speed, Network measurement, Sketch

中图分类号: 

  • TP309
[1]LU Y Q,MAO Z S,CHENG Z,et al.SDN topology attack and defense[J].Journal of South China University of Technology:Natural Science,2020,48(11):114-122.
[2]XU Y H,SUN Z X.Research Development of Abnormal Traffic Detection in Software Defined Networking[J].Journal of Software,2020,31(1):183-207.
[3]LUO L.Anomaly Detection of Backbone Network Based on Dimensionality Reduction[D].Hefei:University of Science and Technology of China,2015.
[4]DING Y.Research on SDN-oriented Flooding Attack Defenseand Recovery Mechanism[D].Tianjin:Civil Aviation University of China,2020.
[5]NIU W N,JIANG T Y,ZHANG X S,et al.Fast-flux Botnet De-tection Method Based on Spatiotemporal Feature of Network Traffic[J].Journal of Electronics & Information Technology,2020,4(8):1872-1880.
[6]LUO L,YIN B Q,CAO J.Anomaly Analysis and Identification of Backbone network Based on Sketch and Regularity Distribution[J].Journal of Systems Science and Mathematics,2015,35(1):1-8.
[7]DOU F F.Research on Sketch-based Data Streams Mining ofFrequent Itemsets[D].Xi'an:Xidian Univeristy,2012.
[8]JING X Y,ZHAO J J,ZHENG Q H,et al.A reversible sketch-based method for detecting and mitigating amplification attacks[J].Journal of Network and Computer Applications,2019,142:15-24.
[9]CORMODE G,MUTHUKRISHNAN S.An improved datastream summary:the count-min sketch and its applications[J].Journal of Algorithms,2005,55(1):58-75.
[10]FENG W F,GUO Q,GUAN Z T,et al.Hierarchical Count-Min Sketch Data Structure for Data Streams[J].Computer Engineering,2007,33(14):20-23.
[11]RANA B,RWALV A,HALEPOVIC E,et al.Modeling Webquality-of-experience on cellular networks [C]//Proceedings of 20th Annual International Conference on Mobile Computing and Networking.New York:ACM,2014:213-224.
[12]ZHU L,FENG Y,ZHU L,et al.Optimization of H.264 encoder based on SIMD technology[J].Computer Applications,2005,25(12):2799-2802.
[13]SUN H M,DU B Y,ZHENG P,et al.BWT algorithm based on AVX instructions in the application of the DNA sequence alignment[J].Journal of Northeast Agricultural University,2016,47(11):93-99.
[14]JIANG W B,WANG H B,LIU P,et al.Hybrid Computational Strategy for Deep Learning Based on AVX2[J].Journal of Tsinghua University,2020,60(5):408-414.
[1] 赵茜, 陈曙晖.
基于LRBG方法的IP定位研究
LRBG-based Approach for IP Geolocation
计算机科学, 2020, 47(11A): 291-295. https://doi.org/10.11896/jsjkx.200300078
[2] 乔焰,焦俊,饶元.
基于数据中心流量特征的端到端流量估计算法
Traffic Estimation for Data Center Network Based on Traffic Characteristics
计算机科学, 2017, 44(2): 171-175. https://doi.org/10.11896/j.issn.1002-137X.2017.02.026
[3] 王晶,汪斌强,申涓.
一种基于测量构件变迁模型的可重构测量构件一致性检测方法
Measurement Component Transfer Model-based Conformance Testing Approach of Reconfigurable Measurement Component
计算机科学, 2015, 42(9): 165-170. https://doi.org/10.11896/j.issn.1002-137X.2015.09.032
[4] 张荣,金跃辉,杨 谈,荣自瞻.
分布式网络测量中测量节点的智能选择算法
Intelligent Selection Algorithm of Measurement Nodes in Distributed Network Measurement
计算机科学, 2015, 42(9): 70-77. https://doi.org/10.11896/j.issn.1002-137X.2015.09.015
[5] 张成伟,程文青,黑晓军.
基于Android平台的3G移动网络测量研究及性能分析
Measurement Study of 3G Mobile Networks Using Android Platform
计算机科学, 2015, 42(2): 24-28. https://doi.org/10.11896/j.issn.1002-137X.2015.02.005
[6] 唐军,裴昌幸,苏博.
IPv6网络路径容量测量方法研究
Research on Path Capacity Estimation for IPv6 Networks
计算机科学, 2011, 38(Z10): 312-314.
[7] 王勇,云晓春,秦志光,郭莉,程红蓉.
P2P网络数据污染综述
Survey on P2P Network Pollutions
计算机科学, 2011, 38(3): 1-4.
[8] 刘琼,刘珍,黄敏.
基于机器学习的IP流量分类研究
Study on Internet Traffic Classification Using Machine Learning
计算机科学, 2010, 37(12): 35-40.
[9] 柳斌 李之棠 李战春 周丽娟.
一种基于Netfilter的BitTorrent流量测量方法

计算机科学, 2007, 34(4): 38-41.
[10] 高文宇 陈松乔 王建新.
分组采样技术研究

计算机科学, 2005, 32(2): 56-59.
[11] 王俊峰 周明天.
高速网络性能测量研究

计算机科学, 2004, 31(9): 66-71.
[12] 周轶刚 徐莹.
Internet测量基础架构的研究

计算机科学, 2002, 29(9): 104-106.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!