计算机科学 ›› 2025, Vol. 52 ›› Issue (2): 268-278.doi: 10.11896/jsjkx.240900059

• 计算机网络 • 上一篇    下一篇

SDN网络中基于分布式Sketch的流基数估计方法

王艺洁1, 高国举1, 孙玉娥2, 黄河1   

  1. 1 苏州大学计算机科学与技术学院 江苏 苏州 215006
    2 苏州大学轨道交通学院 江苏 苏州 215131
  • 收稿日期:2024-09-10 修回日期:2024-11-18 出版日期:2025-02-15 发布日期:2025-02-17
  • 通讯作者: 高国举 (gjgao@suda.edu.cn)
  • 作者简介:(20235227061@stu.suda.edu.cn)
  • 基金资助:
    国家自然科学基金(62472298,62332013,62102275,62072322,U20A20182,62202322);中国博士后基金(2023M732537);江苏省自然科学基金(BK20210704)

Flow Cardinality Estimation Method Based on Distributed Sketch in SDN

WANG Yijie1, GAO Guoju1, SUN Yu'e2, HUANG He1   

  1. 1 School of Computer Science and Technology,Soochow University,Suzhou,Jiangsu 215006,China
    2 School of Rail Transportation,Soochow University,Suzhou,Jiangsu 215131,China
  • Received:2024-09-10 Revised:2024-11-18 Online:2025-02-15 Published:2025-02-17
  • About author:WANG Yijie,born in 2001,postgra-duate.Her main research interest is network traffic measurement.
    GAO Guoju,born in 1990.Ph.D,asso-ciate professor,is a member of CCF(No.C5973M).His main research interests include network traffic measurement,intelligent sensing and computing.
  • Supported by:
    National Natural Science Foundation of China(62472298,62332013,62102275,62072322,U20A20182,62202322),China Postdoctoral Science Foundation(2023M732537) and Natural Science Foundation of Jiangsu Province,China(BK20210704).

摘要: 在软件定义网络(SDN)中,统计流基数信息对于流量工程、流量重路由和攻击检测等很多应用都具有重要作用。现有的研究工作主要分为在单交换机上部署测量结构和多交换机协同测量两类。然而,这两类方案都无法实现全网流量覆盖,并且多交换机协同测量通常采用每个交换机独立测量再合并的方式,容易导致重复计数。为了解决上述问题,提出了一种基于分布式Sketch的流基数估计方法。该方法充分利用SDN控制平面集中控制的优势,协同利用每条流的最长连续公共子路径上的交换机,以构建该流的逻辑层计数结构。同时,建立了每条流从逻辑层计数结构到物理交换机空间的映射,使参与计数的交换机可以根据自身状态和实际负载情况动态调整映射区间,以实现全网交换机之间的负载均衡。以vHLL算法为例,实现了一个分布式流基数估计方法的原型,并在四元Fat-Tree网络拓扑下使用真实网络流量数据集进行了实验评估。实验结果表明,所提方法能够有效实现全网每流基数估计,在准确性方面,其ARE和AAE值最多优于对比实验方法94.7%和93.8%;在负载均衡方面,该方法充分利用全网所有交换机,其归一化后的平均数据包负载为0.394,低于所有对比方法,表现良好。

关键词: SDN网络, 流基数测量, 分布式网络, Sketch, 负载均衡

Abstract: In software-defined networks(SDN),statistical flow cardinality information plays a crucial role in various applications such as traffic engineering,traffic rerouting,and attack detection.Existing research primarily falls into two categories:measurement structures deployed on a single switch and collaborative measurements across multiple switches.However,neither of the two approaches achieves full network flow coverage,and collaborative measurements often employ independent measurements from each switch followed by merging,which can lead to duplicate counting.To address these issues,a flow cardinality estimation method based on distributed sketches is proposed.This method leverages the advantages of centralized control in the SDN control plane and collaboratively utilizes switches along the longest continuous common sub-path of each flow to construct a logical layer counting structure for that flow.Additionally,a mapping from the logical layer counting structure to the physical switch space for each flow is established,allowing participating switches to dynamically adjust the mapping intervals based on their own states and actual load conditions,thus achieving load balancing among all switches in the network.Taking the vHLL algorithm as an example,we implement a prototype of the distributed flow cardinality estimation method and conduct experimental evaluations using a real network traffic dataset on a four-layer Fat-Tree network topology.The experimental results demonstrate that the proposed method effectively achieves flow cardinality estimation across the entire network.In terms of accuracy,its average relative error(ARE) and average absolute error(AAE) values outperform the comparison experiments by up to 94.7% and 93.8%,respectively.Regarding load balancing,the method fully utilizes all switches in the network,achieving a normalized average packet load of 0.394,which is lower than that of the comparison methods,indicating its good performance.

Key words: Software-defined network, Flow cardinality measurement, Distributed network, Sketch, Load balancing

中图分类号: 

  • TP391
[1]HONG C Y,KANDULA S,MAHAJAN R,et al.Achievinghigh utilization with software-driven WAN[C]//Proceedings of the ACM SIGCOMM 2013 Conference on SIGCOMM.New York:Association for Computing Machinery,2013:15-26.
[2]JAIN S,KUMAR A,MANDAL S,et al.B4:Experience with a globally-deployed software defined WAN[J].ACM SIGCOMM Computer Communication Review,2013,43(4):3-14.
[3]XU H,YU Z,LI X Y,et al.Joint route selection and updatescheduling for low-latency update in SDNs[J].IEEE/ACM Transactions on Networking,2017,25(5):3073-3087.
[4]XU H,HUANG H,CHEN S,et al.Achieving high scalability through hybrid switching in software-defined networking[J].IEEE/ACM Transactions on Networking,2018,26(1):618-632.
[5]LIU Z,MANOUSIS A,VORSANGER G,et al.One sketch to rule them all:Rethinking network flow monitoring with univmon[C]//Proceedings of the 2016 ACM SIGCOMM Confe-rence.New York:Association for Computing Machinery,2016:101-114.
[6]YANG T,ZHANG H,LI J,et al.HeavyKeeper:an accurate algorithm for finding Top-$ k $ elephant flows[J].IEEE/ACM Transactions on Networking,2019,27(5):1845-1858.
[7]ZHAO B,LI X,TIAN B,et al.Dhs:Adaptive memory layout organization of sketch slots for fast and accurate data stream processing[C]//Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining.New York:Association for Computing Machinery,2021:2285-2293.
[8]YANG T,JIANG J,LIU P,et al.Elastic sketch:Adaptive and fast network-wide measurements[C]//Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication.New York:Association for Computing Machinery,2018:561-575.
[9]LI T,CHEN S,LUO W,et al.Scan detection in high-speed networks based on optimal dynamic bit sharing[C]//2011 Procee-dings IEEE INFOCOM.Montreal:IEEE,2011:3200-3208.
[10]XIAO Q,CHEN S,ZHOU Y,et al.Cardinality estimation for elephant flows:A compact solution based on virtual register sharing[J].IEEE/ACM Transactions on Networking,2017,25(6):3738-3752.
[11]JIA P,WANG P,ZHANG Y,et al.Accurately estimating user cardinalities and detecting super spreaders over time[J].IEEE Transactions on Knowledge and Data Engineering,2020,34(1):92-106.
[12]ZHANG H,HUANG H,SUN Y E,et al.MIME:Fast and Accurate Flow Information Compression for Multi-Spread Estimation[C]//2023 IEEE 31st International Conference on Network Protocols(ICNP).Montreal:IEEE,2023:1-11.
[13]JING X,HAN H,YAN Z,et al.SuperSketch:A multi-dimen-sional reversible data structure for super host identification[J].IEEE Transactions on Dependable and Secure Computing,2021,19(4):2741-2754.
[14]BASAT R B,EINZIGER G,TAYH B.Cooperative network-wide flow selection[C]//2020 IEEE 28th International Confe-rence on Network Protocols(ICNP).Montreal:IEEE,2020:1-11.
[15]AL-FARES M,LOUKISSAS A,VAHDAT A.A scalable,commodity data center network architecture[J].ACM SIGCOMM Computer Communication Review,2008,38(4):63-74.
[16]LI Y,YU X,YANG Y,et al.Pyramid family:Generic frameworks for accurate and fast flow size measurement[J].IEEE/ACM Transactions on Networking,2021,30(2):586-600.
[17]ESTAN C,VARGHESE G,FISK M.Bitmap algorithms forcounting active flows on high speed links[C]//Proceedings of the 3rd ACM SIGCOMM Conference on Internet Measurement.New York:Association for Computing Machinery,2003:153-166.
[18]FLAJOLET P,MARTIN G N.Probabilistic counting algorithms for data base applications[J].Journal of Computer and System Sciences,1985,31(2):182-209.
[19]DURAND M,FLAJOLET P.Loglog counting of large cardinali-ties[C]//Algorithms-ESA 2003:11th Annual European Symposium,Budapest,Hungary,September 16-19,2003.Proceedings 11.Berlin Heidelberg:Springer,2003:605-617.
[20]FLAJOLET P,FUSY É,GANDOUET O,et al.Hyperloglog:the analysis of a near-optimal cardinality estimation algorithm[J] Discrete Mathematics & Theoretical Computer Science,2007:137-156.https://dmtcs.episciences.org/3545/pdf.
[21]DU Y,HUANG H,SUN Y E,et al.A Better Cardinality Estimator with Fewer Bits,Constant Update Time,and Mergeability[C]//IEEE INFOCOM 2023-IEEE Conference on Computer Communications.Montreal:IEEE,2023:1-10.
[22]TANG L,HUANG Q,LEE P P C.SpreadSketch:Toward in-vertible and network-wide detection of superspreaders[C]//IEEE INFOCOM 2020-IEEE Conference on Computer Communications.Montreal:IEEE,2020:1608-1617.
[23]JING X,YAN Z,HAN H,et al.ExtendedSketch:Fusing network traffic for super host identification with a memory efficient sketch[J].IEEE Transactions on Dependable and Secure Computing,2021,19(6):3913-3924.
[24]CAIDA.Anonymized internet traces 2019 [EB/OL].(2019-01)[2024-08-08].https://catalog.caida.org/details/dataset/passive_2019_pcap.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!