计算机科学 ›› 2025, Vol. 52 ›› Issue (10): 98-105.doi: 10.11896/jsjkx.241000033

• 数据库&大数据&数据科学 • 上一篇    下一篇

ACCF:时间预测机制驱动的top-k流测量

胡永庆, 杨含, 刘子源, 秦广军, 戴庆龙   

  1. 北京联合大学智慧城市学院 北京 100101
  • 收稿日期:2024-10-09 修回日期:2025-02-03 出版日期:2025-10-15 发布日期:2025-10-14
  • 通讯作者: 戴庆龙(xxtqinglong@buu.edu.cn)
  • 作者简介:(huyongqing2021@163.com)
  • 基金资助:
    面向海量事件流的流式计算框架的研究(CCFIS2019-01-01);智慧交通中的光与无线融合接入网组网技术研究(KM202111417010);拓扑结构引导的深度学习优化技术研究(ZK10202403)

ACCF:Time Prediction Mechanism-driven Top-k Flow Measurement

HU Yongqing, YANG Han, LIU Ziyuan, QING Guangjun, DAI Qinglong   

  1. Smart City College of Beijing Union University,Beijing 100101,China
  • Received:2024-10-09 Revised:2025-02-03 Online:2025-10-15 Published:2025-10-14
  • About author:HU Yongqing,born in 1999,postgra-duate,is a member of CCF (No.G9824G).His main research interests include network traffic measurement and software-defined networking.
    DAI Qinglong,born in 1988,Ph.D,associate professor,master supervisor,is a member of CCF (No.H6517M).His main research interests include edge computing,big data and cloud computing.
  • Supported by:
    Research on Stream Computing Frameworks for Massive Event Streams(CCFIS2019-01-01),Research on Optical and Wireless Converged Access Network Technologies in Smart Transportation Systems(KM202111417010)and Research on Topology-Guided Deep Learning Optimization Techniques(ZK10202403).

摘要: 针对当前top-k流测量过滤算法依赖固定计数器阈值的问题,提出了基于活跃度预测机制的ACCF(Activity Counting Cuckoo Filter)测量结构。ACCF通过引入活跃度预测机制,利用时间序列分析和指数加权移动平均(Exponentially Weighted Moving Average,EWMA)机制,动态计算网络流的活跃度,实现对潜在的top-k流的实时识别与提前过滤。针对哈希冲突可能导致的精度损失,ACCF引入了自刷新存储表(Self-Refreshing Storage Table,SRST),用于存储踢出路径上的网络流信息。当踢出操作达到设定的MaxNumKicks值时,SRST会在局部范围内优先踢出活跃度最小的网络流项,避免重要流量信息丢失。实验结果证明,ACCF与SRST在合适的参数组合条件下,可以提前过滤65%左右的大流并减少41%左右的插入操作,并显著提升了在top-k流量测量中的精度,尤其是在与传统的Space Saving(SS),CM Sketch,LUSketch和Cuckoo Counter算法对比时,展现了明显的优势。

关键词: top-k, 活跃度, 时间序列, EWMA, SRST, Sketch

Abstract: In addressing the problem that current top-k flow measurement filtering algorithms depend on fixed counter thresholds,a measurement structure named ACCF based on the activity prediction mechanism has been put forward.ACCF incorporates the activity prediction mechanism,utilizing time series analysis and the EWMA mechanism,to dynamically compute the activity of network flows and accomplish real-time identification and early filtering of potential top-k flows.With respect to the accuracy loss that may be induced by hash conflicts,ACCF introduces a SRST for storing the network flow information on the evicted paths.When the eviction operation reaches the predefined MaxNumKicks value,SRST will give priority to evicting the network flow item with the lowest activity within the local scope to avoid the loss of crucial traffic information.Experimental results indicate that,under suitable parameter combinations,ACCF and SRST can filter out approximately 65% of the major flows in advance and reduce insertion operations by approximately 41%,significantly improving the accuracy in top-k traffic measurement,especially when compared with traditional algorithms such as Space Saving (SS),CM Sketch,LUSketch,and Cuckoo Counter,thereby de-monstrating distinct advantages.

Key words: Top-k,Activity,Time series,EWMA,SRST,Sketch

中图分类号: 

  • TP393
[1]ZHANG T,ZHANG Q,LEI Y,et al.Load balancing with traffic isolation in data center networks[J].Future Generation Computer Systems,2022,127:126-141.
[2]LIU S,HE M,WU Z,et al.Spatial-temporal graph neural network traffic prediction based load balancing with reinforcement learning in cellular networks[J].Information Fusion,2024,103:102079.
[3]HUANG Q,LEE P P C,BAO Y.Sketchlearn:Relieving userburdens in approximate measurement with automated statistical inference[C]//Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication.2018:576-590.
[4]ZHANG X,WANG T.Elastic and reliable bandwidth reserva-tion based on distributed traffic monitoring and control[J].IEEE Transactions on Parallel and Distributed Systems,2022,33(12):4563-4580.
[5]BISWAS R,WU J.Traffic engineering to minimize the number of rules in sdn datacenters[J].IEEE Transactions on Network Science and Engineering,2021,8(2):1467-1477.
[6]ROBIN D D,KHAN J I.P4TE:PISA switch based traffic engineering in fat-tree data center networks[J].Computer Networks,2022,215:109210.
[7]KUMAR R,VENKANNA U,TIWARI V.Optimized traffic engineering in Software Defined Wireless Network based IoT (SDWN-IoT):State-of-the-art,research opportunities and challenges[J].Computer Science Review,2023,49:100572.
[8]SHI G,SHEN X,XIAO F,et al.DANTD:A deep abnormal network traffic detection model for security of industrial internet of things using high-order features[J].IEEE Internet of Things Journal,2023,10(24):21143-21153.
[9]FERNÁNDEZ Y,SOTO J E,VERA S,et al.A streaming algo-rithm and hardware accelerator to estimate the empirical entropy of network flows[J].Computer Networks,2023,237:110035.
[10]WU M,HUANG H,SUN Y E,et al.ActiveKeeper:an accurate and efficient algorithm for finding top-k elephant flows[J].IEEE Communications Letters,2021,25(8):2545-2549.
[11]WANG H,XU H,HUANG L,et al.Fast and accurate trafficmeasurement with hierarchical filtering[J].IEEE Transactions on Parallel and Distributed Systems,2020,31(10):2360-2374.
[12]KANNAN K,BANERJEE S.Compact TCAM:Flow entry compaction in TCAM for power aware SDN[C]//International Conference on Distributed Computing and Networking.Berlin:Springer,2013:439-444.
[13]OZERY A,DIAMANT J, FEIBISH S L.RecenTo:FindingTop-K Flows of the Recent Past[J].Proceedings of the ACM on Networking,2024,2(CoNEXT3):1-20.
[14]SHI Q,XU Y,QI J,et al.Cuckoo counter:Adaptive structure of counters for accurate frequency and top-k estimation[J].IEEE/ACM Transactions on Networking,2023,31(4):1854-1869.
[15]ZHANG B,HUANG H,SUN Y E,et al.Jigsaw-Sketch:a fast and accurate algorithm for finding top-k elephant flows in high-speed networks[J].Science China Information Sciences,2024,67(4):142101.
[16]ZHU Y,SONG Q,LUO Y.Differentially Private Top-k Flows Estimation Mechanism in Network Traffic[J].IEEE Transactions on Network Science and Engineering,2023,11(3):2462-2472.
[17]CORMODE G,MUTHUKRISHNAN S.An improved datastream summary:the count-min sketch and its applications[J].Journal of Algorithms,2005,55(1):58-75.
[18]DIMITROPOULOS X,HURLEY P,KIND A.Probabilisticlossy counting:An efficient algorithm for finding heavy hitters[J].ACM SIGCOMM Computer Communication Review,2008,38(1):5-5.
[19]MITZENMACHER M,STEINKE T,THALER J.Hierarchical heavy hitters with the space saving algorithm[C]//2012 Proceedings of the Fourteenth Workshop on Algorithm Engineering and Experiments (ALENEX).Society for Industrial and Applied Mathematics,2012:160-174.
[20]ROY P,KHAN A,ALONSO G.Augmented sketch:Faster and more accurate stream processing[C]//Proceedings of the 2016 International Conference on Management of Data.2016:1449-1463.
[21]ZHOU Y,YANG T,JIANG J,et al.Cold filter:A meta-framework for faster and more accurate stream processing[C]//Proceedings of the 2018 International Conference on Management of Data.2018:741-756.
[22]YU X,XU H,YAO D,et al.CountMax:A lightweight and cooperative sketch measurement for software-defined networks[J].IEEE/ACM Transactions on Networking,2018,26(6):2774-2786.
[23]LU J,CHEN H,ZHANG Z.LUSketch:A fast and precisesketch for top-k finding in data streams[C]//2022 International Conference on Computer Communications and Networks (ICCCN).IEEE,2022:1-10.
[24]WANG C,LI X,ZENG J,et al.Prob-CS:A Probabilistic Cuckoo Sketch for Accurate Network Traffic Measurement[J].IEEE Internet of Things Journal,2024,11(22):36965-36978.
[25]XIONG B,LIU Y,LIU R,et al.ActiveGuardian:An accurateand efficient algorithm for identifying active elephant flows in network traffic[J].Journal of Network and Computer Applications,2024,224:103853.
[26]YANG T,ZHANG H,LI J,et al.HeavyKeeper:an accurate algorithm for finding Top-k elephant flows[J].IEEE/ACM Transactions on Networking,2019,27(5):1845-1858.
[27]YANG T,GONG J,ZHANG H,et al.Heavyguardian:Separate and guard hot items in data streams[C]//Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining.2018:2584-2593.
[28]JIA P,WANG P,ZHAO J,et al.Loglog filter:Filtering colditems within a large range over high speed data streams[C]//2021 IEEE 37th International Conference on Data Engineering (ICDE).IEEE,2021:804-815.
[29]LI Y,WANG F,YU X,et al.Ladderfilter:Filtering infrequent items with small memory and time overhead[C]//Proceedings of the ACM on Management of Data.2023:1-21.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!