Computer Science ›› 2025, Vol. 52 ›› Issue (8): 393-402.doi: 10.11896/jsjkx.240900085

• Information Security • Previous Articles     Next Articles

Super Spreader Detection Algorithm Based on Adaptive Sampling

SUN Jingyu1, HUANG He1, SUN Yu'e2, ZHANG Boyu1   

  1. 1 School of Computer Science and Technology,Soochow University,Suzhou,Jiangsu 215006,China
    2 School of Rail Transportation,Soochow University,Suzhou,Jiangsu 215131,China
  • Received:2024-09-12 Revised:2024-12-08 Online:2025-08-15 Published:2025-08-08
  • About author:SUN Jingyu,born in 2001,postgra-duate.His main research interest is network traffic measurement.
    SUN Yu'e,born in 1983,Ph.D,professor,Ph.D supervisor,is a member of CCF(No.28402M).Her main research interests include traffic measurement,Internet of Things,privacy presrving and crowd sensing.
  • Supported by:
    National Natural Science Foundation of China(62332013,62072322,U20A20182,62202322).

Abstract: In high-speed network environments,super-spreaders are defined as hosts or devices with a large number of connections.Accurate super spreader detection is crucial for many applications such as network monitoring,security analysis,and traffic management.Invertible algorithms based on sketches have received extensive attention due to their excellent memory efficiency and the ability to directly recover the super spreader identity from the internal structure.According to application interests,the packets sent or received from the same host or device are abstracted as a flow.The flow distribution in high-speed network is usually highly skewed,only a few flows are large flows,and the vast majority are small flows.However,the memory structure design of the existing research cannot efficiently adapt to the highly skewed flow distribution,which makes the memory resource utilization low.Therefore,this paper designs an adaptive sampling based super spreader detection algorithm AS-SSD.The algorithm proposes an adaptive sampling strategy based on register sharing,which makes up for the above shortcomings.AS-SSD first maps the elements of all flows into a register array.Small flows only consume a few registers,while larger flows consume more,which adapts to the skewed flow distribution.Then,AS-SSD uses an adaptive sampling strategy to dynamically adjust the element sampling probability of flows of different sizes,which reduces the occupation of registers by large flows and further improves the utilization efficiency of memory resources under the premise of ensuring accuracy.Experimental evaluations show that compared with the previous work,AS-SSD shows higher detection accuracy in the super spreader detection task while maintaining high throughput processing capability.Compared with the most advanced algorithms,it can increase the F1 value by more than 0.609.

Key words: Traffic measurement, Super spreader detection, High-speed network, Adaptive sampling, Sketch

CLC Number: 

  • TP391
[1]SEN S,WANG J.Analyzing peer-to-peer traffic across largenetworks[C]//Proceedings of the 2nd ACM SIGCOMM Workshop on Internet Measurement.2002:137-150.
[2]JING X,YAN Z,PEDRYCZ W.Security data collection and data analytics in the internet:A survey[J].IEEE Communications Surveys & Tutorials,2018,21(1):586-618.
[3]DURUMERIC Z,BAILEY M,HALDERMAN J A.An {Internet-Wide} view of {Internet-Wide} scanning[C]//23rd USENIX Security Symposium(USENIX Security 14).2014:65-78.
[4]SINGH S,ESTAN C,VARGHESE G,et al.Automated Worm Fingerprinting[C]//OSDI.2004.
[5]JuniperNetworks.Qfx5110 ethernet switch datasheet[EB/OL].https://www.juniper.net/us/en/products/switches/qfx-series/.
[6]YOON M K,CHEN S.Detecting stealthy spreaders by random aging streaming filters[J].IEICE Transactions Cn communications,2011,94(8):2274-2281.
[7]WANG H,MA C,ODEGBILE O O,et al.Randomized error removal for online spread estimation in data streaming[C]//Proceedings of the VLDB Endowment.2021.
[8]TANG L,XIAO Y,HUANG Q,et al.A high-performance invertible sketch for network-wide superspreader detection[J].IEEE/ACM Transactions on Networking,2022,31(2):724-737.
[9]BRUSCHI V,PONTARELLI S,TOLLET J,et al.FlowFight:High performance-low memory top-k spreader detection[J].Computer Networks,2021,196:108239.
[10]JING X,YAN Z,HAN H,et al.ExtendedSketch:Fusing network traffic for super host identification with a memory efficient sketch[J].IEEE Transactions on Dependable and Secure Computing,2021,19(6):3913-3924.
[11]HAN H,JING X,YAN Z,et al.ExtendedSketch+:Super host identification and network host trust evaluation with memory efficiency and high accuracy[J].Information Fusion,2023,92:300-312.
[12]JIA P,WANG P,ZHANG Y,et al.Accurately estimating user cardinalities and detecting super spreaders over time[J].IEEE Transactions on Knowledge and Data Engineering,2020,34(1):92-106.
[13]CAIDA.Anonymized internet traces 2019 [EB/OL].https://catalog.caida.org/dataset/ passive_2019_pcap.
[14]CAO J,JIN Y,CHEN A,et al.Identifying high cardinality internet hosts[C]//IEEE INFOCOM 2009.IEEE,2009:810-818.
[15]KAMIYAMA N,MORI T,KAWAHARA R.Simple and adaptive identification of superspreaders by flow sampling[C]//IEEE INFOCOM 2007-26th IEEE International Conference on Computer Communications.IEEE,2007:2481-2485.
[16]SHI X,CHIU D M,LUI J C S.An online framework for cat-ching top spreaders and scanners[J].Computer Networks,2010,54(9):1375-1388.
[17]VENKATARAMAN S,SONG D X,GIBBONS P B,et al.New streaming algorithms for fast detection of superspreaders[C]//NDSS.2005:149-166.
[18]LI Y,MIAO R,KIM C,et al.{FlowRadar}:A better {NetFlow} for data centers[C]//13th USENIX Symposium on Networked Systems Design and Implementation(NSDI 16).2016:311-324.
[19]YANG T,GAO S,SUN Z,et al.Diamond sketch:Accurate per-flow measurement for bigstreaming data[J].IEEE Transactions on Parallel and Distributed Systems,2019,30(12):2650-2662.
[20]HUANG H,SUN Y E,MA C,et al.Spread estimation withnon-duplicate sampling in high-speed networks[J].IEEE/ACM Transactions on Networking,2021,29(5):2073-2086.
[21]DU Y,HUANG H,SUN Y E,et al.Self-adaptive sampling for network traffic measurement[C]//IEEE INFOCOM 2021-IEEE Conference on Computer Communications.IEEE,2021:1-10.
[22]DU Y,HUANG H,SUN Y E,et al.Short-term memory sampling for spread measurement in high-speed networks[C]//IEEE INFOCOM 2022-IEEE Conference on Computer Communications.IEEE,2022:470-479.
[23]KUMAR K,XU J,WANG J,et al.Spacecode bloom filter for efficient per-flow traffic measurement[C]//Infocom Twenty-Third Joint Conference of the IEEE Computer & Communications Societies.2004.
[24]DING D,SAVI M,PEDERZOLLI F,et al.In-network volumetric DDoS victim identification using programmable commodity switches[J].IEEE Transactions on Network and Service Management,2021,18(2):1191-1202.
[25]CORMODE G,MUTHUKRISHNAN S.An improved datastream summary:the count-min sketch and its applications[J].Journal of Algorithms,2005,55(1):58-75.
[26]ESTAN C,VARGHESE G,FISK M.Bitmap algorithms forcounting active flows on high speed links[C]//Proceedings of the 3rd ACM SIGCOMM Conference on Internet Measurement.2003:153-166.
[27]MA C,CHEN S,ZHANG Y,et al.Super spreader identification using geometric-min filter[J].IEEE/ACM Transactions on Networking,2021,30(1):299-312.
[28]WANG P,GUAN X,QIN T,et al.A data streaming method for monitoring host connection degrees of high-speed links[J].IEEE Transactions on Information Forensics and Security,2011,6(3):1086-1098.
[29]LIU W,QU W,GONG J,et al.Detection of superpoints using a vector bloom filter[J].IEEE Transactions on Information Forensics and Security,2015,11(3):514-527.
[30]LIU Y,CHEN W,GUAN Y.Identifying high-cardinality hosts from network-wide traffic measurements[J].IEEE Transactions on Dependable and Secure Computing,2015,13(5):547-558.
[31]FLAJOLET P,FUSY É,GANDOUET O,et al.Hyperloglog:the analysis of a near-optimal cardinality estimation algorithm[C]//Discrete Mathematics & Theoretical Computer Science.2007.
[32]METWALLY A,AGRAWAL D,EL ABBADI A.Efficient computation of frequent and top-k elements in data streams[C]//International Conference on Database Theory.Berlin:Springer,2005:398-412.
[33]XIAO Q,CHEN S,CHEN M,et al.Hyper-compact virtual estimators for big network data based on register sharing[C]//Proceedings of the 2015 ACM SIGMETRICS International Confe-rence on Measurement and Modeling of Computer Systems.2015:417-428.
[1] ZHONG Yue, GU Jieming. 3D Reconstruction of Single-view Sketches Based on Attention Mechanism and Contrastive Loss [J]. Computer Science, 2025, 52(3): 77-85.
[2] WANG Yijie, GAO Guoju, SUN Yu'e, HUANG He. Flow Cardinality Estimation Method Based on Distributed Sketch in SDN [J]. Computer Science, 2025, 52(2): 268-278.
[3] WANG Jiayu, YU Junqing, LI Dong, ZHAO Junyang. Network Microburst Traffic Measurement Method Based on Sketch Data Structure [J]. Computer Science, 2025, 52(1): 374-382.
[4] ZHENG Haibin, LIU Xinran, CHEN Jinyin, WANG Pengcheng, WANG Xuanye. Integrity Interference Attack and Defense Methods for Network Traffic Measurement [J]. Computer Science, 2024, 51(8): 420-428.
[5] CHEN Xinyang, CHEN Hanze, ZHOU Jiasheng, HUANG Jiaqing, YU Jiashuo, ZHU Longlong, ZHANG Dong. IntervalSketch:Approximate Statistical Method for Interval Items in Data Stream [J]. Computer Science, 2024, 51(4): 4-10.
[6] MAO Chenyu, HUANG He, SUN Yu'e, DU Yang. Global Top-K Frequent Flow Measurement for Continuous Periods in Distributed Networks [J]. Computer Science, 2024, 51(4): 28-38.
[7] WU Yanni, ZHOU Zhengyan, CHEN Hanze, ZHANG Dong. RBFRadar:Detecting Remarkable Burst Flows with Programmable Data Plane [J]. Computer Science, 2024, 51(4): 48-55.
[8] JING Youxian, ZHU Qingsheng. ORB Algorithm Based on Key Point Density Optimization [J]. Computer Science, 2024, 51(11A): 240300048-5.
[9] ZHOU Bo, JIANG Peifeng, DUAN Chang, LUO Yuetong. Study on Single Background Object Detection Oriented Improved-RetinaNet Model and Its Application [J]. Computer Science, 2023, 50(7): 137-142.
[10] HAN Qiqi, LIU Xin. Application of Air-Sea Coupled Mode in High-speed Interconnection Environment [J]. Computer Science, 2023, 50(11A): 221000136-5.
[11] DOU Zhi, WANG Ning, WANG Shi-jie, WANG Zhi-hui, LI Hao-jie. Sketch Colorization Method with Drawing Prior [J]. Computer Science, 2022, 49(4): 195-202.
[12] TAN Ling-ling, YANG Fei, YI Jun-kai. Optimization Study of Sketch Algorithm Based on AVX Instruction Set [J]. Computer Science, 2021, 48(11A): 585-587.
[13] LUO Yue-tong, JIANG Pei-feng, DUAN Chang, ZHOU Bo. Small Object Detection Oriented Improved-RetinaNet Model and Its Application [J]. Computer Science, 2021, 48(10): 233-238.
[14] TIAN Wei, LIU Hao, CHEN Gen-long, GONG Xiao-hui. Cross Subset-guided Adaptive Measurement for Block Compressive Sensing [J]. Computer Science, 2020, 47(12): 190-196.
[15] LI Zong-min, LI Si-yuan, LIU Yu-jie, LI Hua. Sketch-based Image Retrieval Based on Attention Model [J]. Computer Science, 2020, 47(11): 199-204.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!