计算机科学 ›› 2024, Vol. 51 ›› Issue (4): 28-38.doi: 10.11896/jsjkx.231000119
毛晨宇1, 黄河1, 孙玉娥2, 杜扬1
MAO Chenyu1, HUANG He1, SUN Yu'e2, DU Yang1
摘要: 在分布式网络中,测量top-K频繁流对资源分配、安全监控等应用至关重要。现有的top-K频繁流测量工作存在不适用于测量分布式网络流量或只考虑单时间周期等局限。为此,提出了分布式网络中连续时间周期的全局top-K频繁流测量方案,在分布节点中布置了紧凑的概率数据结构来记录网络流信息,每个时间周期结束后分布节点向中心节点发送必要信息,中心节点汇聚得到从测量开始至当前时间周期的全局top-K频繁流。考虑到每条流可能出现在一个或多个测量节点,使用了不同的方法来减少传输开销。对于每条流只会出现在单一节点的情况,采用传输分段最小值的方法来获得阈值,实验结果表明这种方法减少了全量传输超过50%的传输开销。对于每条流会出现在多个节点的情况,提出了多阶段无误差处理方法和单阶段快速处理方法,分别应对不能容忍误差的场景和实际高速网络流量,相比每个时间周期都使用已有单周期方法,传输开销的实验表现降低了两个数量级。最后还提出了一种利用历史平均增值信息降低通信延迟的方法,实验结果表明该方法有效降低了限制信息的平均相对误差。
中图分类号:
[1]SHARMA H,THAKKAR P,BHARADWAJ S,et al.Optimi-zing Network Provisioning through Cooperation[C]//19th USENIX Symposium on Networked Systems Design and Implementation(NSDI 22).Renton:USENIX Association,2022:1179-1194. [2]HUANG H,SUN Y E,MA C Y,et al.Spread estimation with non-duplicate sampling in high-speed networks[J].IEEE/ACM Transactions on Networking,2021,29(5):2073-2086. [3]CAIDA.Anonymized internet traces 2019 [EB/OL].(2019-01)[2023-10-08].https://catalog.caida.org/details/dataset/passive_2019_pcap. [4]ZHAO Y K,HAN W C,ZHONG Z,et al.Double-Anonymous Sketch:Achieving Top-K-fairness for Finding Global Top-K Frequent Items[J].Proceedings of the ACM on Management of Data,2023,1(1):1-26. [5]SONG W,BESHLEY M,PRZYSTUPA K,et al.A softwaredeep packet inspection system for network traffic analysis and anomaly detection[J].Sensors,2020,20(6):1637. [6]JIN K,XIE K,TIAN J Z,et al.Low cost network traffic mea-surement and fast recovery via redundant row subspace-based matrix completion[J].Connection Science,2023,35(1):2218069. [7]YANG T,ZHANG H W,LI J Y,et al.HeavyKeeper:an accurate algorithm for finding Top-k elephant flows[J].IEEE/ACM Transactions on Networking,2019,27(5):1845-1858. [8]YANG T,JIANG J,LIU P,et al.Elastic sketch:Adaptive and fast network-wide measurements[C]//Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication.New York:Association for Computing Machinery,2018:561-575. [9]LIU L T,SHEN Y L,YAN Y B,et al.Sf-sketch:a two-stage sketch for data streams[J].IEEE Transactions on Parallel and Distributed Systems,2020,31(10):2263-2276. [10]DAI H P,LI M,LIU A,et al.Finding persistent items in distri-buted datasets[J].IEEE/ACM Transactions on Networking,2019,28(1):1-14. [11]LI M,DAI H P,WANG X Y,et al.Thresholded monitoring in distributed data streams[J].IEEE/ACM Transactions on Networking,2020,28(3):1033-1046. [12]BASAT R,EINZIGER G,TAYH B.Cooperative network-wide flow selection[C]//2020 IEEE 28th International Conference on Network Protocols(ICNP).Madrid:IEEE Press,2020:1-11. [13]CAO P,WANG Z.Efficient top-k query calculation in distributed networks[C]//Proceedings of the Twenty-third Annual ACM Symposium on Principles of Distributed Computing.New York:Association for Computing Machinery,2004:206-215. [14]MICHEL S,TRIANTAFILLOU P,WEIKUM G.Klee:Aframework for distributed top-k query algorithms[C]//Procee-dings of the 31st International Conference on Very Large Data Bases.New York:ACM,2005:637-648. [15]ANCEAUME E,BUSNEL Y,RIVETTI N,et al.Identifying global icebergs in distributed streams[C]//2015 IEEE 34th Symposium on Reliable Distributed Systems(SRDS).Montreal:IEEE Press,2015:266-275. [16]TARKOMA S,ROTHENBERG C,LAGERSPETZ E.Theory and practice of bloom filters for distributed systems[J].IEEE Communications Surveys & Tutorials,2011,14(1):131-155. [17]TING D.Data sketches for disaggregated subset sum and frequent item estimation[C]//Proceedings of the 2018 Interna-tional Conference on Management of Data.New York:Association for Computing Machinery,2018:1129-1140. [18]LI J Z,LI Z K,XU Y F,et al.Wavingsketch:An unbiased and generic sketch for finding top-k items in data streams[C]//Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining.New York:Association for Computing Machinery,2020:1574-1584. [19]CORMODE G,MUTHUKRISHNAN S.An improved datastream summary:the count-min sketch and its applications[J].Journal of Algorithms,2005,55(1):58-75. [20]ESTAN C,VARGHESE G.New directions in traffic measurement and accounting[C]//Proceedings of the 2002 Conference on Applications,Technologies,Architectures,and Protocols for Computer Communications.New York:New Directions in Traffic Measurement and Accounting,2002:323-336. |
|