Computer Science, 2025, Vol. 52, Issue (2): 344-352. doi: 10.11896/jsjkx.240400029

• Information Security •

Augmenter: Event-level Intrusion Detection Based on Data Provenance Graph

SUN Hongbin1,2, WANG Su3, WANG Zhiliang1,3, JIANG Zheyu1, YANG Jiahai1,3, ZHANG Hui1,2   

  1. 1 Institute for Network Sciences and Cyberspace, Tsinghua University, Beijing 100084, China
    2 Quancheng Laboratory, Ji'nan 250000, China
    3 Zhongguancun Laboratory, Beijing 100094, China
  • Received: 2024-04-03 Revised: 2024-08-30 Online: 2025-02-15 Published: 2025-02-17
  • About author: SUN Hongbin, born in 1999, postgraduate. His main research interest is provenance-based intrusion detection.
    ZHANG Hui, born in 1973, postgraduate. His main research interest is network measurement.
  • Supported by:
    Quancheng Laboratory (QCLZD202304-2) and Research Project of Provincial Laboratory of Shandong, China (SYS202201).

Abstract: In recent years, advanced persistent threat (APT) attacks have become increasingly prevalent. Data provenance graphs, which contain rich contextual information reflecting process execution, have shown potential for detecting APT attacks. As a result, provenance-based intrusion detection systems (PIDS) have garnered attention. PIDS identify malicious behavior by capturing system logs and generating provenance graphs from them. PIDS face three main challenges: efficiency, generality, and real-time capability, with efficiency the most pressing. Current PIDS generate thousands of alerts for a single anomalous node or graph, leading to a large number of false positives that burden security personnel. This paper presents Augmenter, the first PIDS that simultaneously addresses the three aforementioned challenges. Augmenter partitions processes into communities based on the information fields of nodes, effectively learning the behavior of different processes. Additionally, Augmenter introduces a time-window strategy for subgraph partitioning and employs an unsupervised feature extraction method based on graph mutual information maximization. The incremental feature extraction algorithm amplifies abnormal behavior and distinguishes it from normal behavior. Finally, Augmenter trains multiple clustering models based on process types to achieve event-level detection, allowing for more precise localization of attack behaviors. Augmenter is evaluated on the DARPA dataset, and its real-time performance is confirmed by measuring the efficiency of the detection phase. In terms of detection efficiency, we compare its precision and recall with those of the state-of-the-art works Kairos and ThreaTrace. Kairos achieves precision and recall of 0.17 and 0.80, while ThreaTrace achieves 0.29 and 0.76. In contrast, Augmenter achieves precision and recall of 0.83 and 0.97, demonstrating significantly higher precision and detection performance.
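
The time-window strategy for subgraph partitioning mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the window width, the event schema, or how windows are aligned, so the `Event` fields, the `partition_by_time_window` helper, and the 60-second default below are all hypothetical choices made for illustration, not the authors' implementation.

```python
from collections import defaultdict
from typing import NamedTuple


class Event(NamedTuple):
    """One provenance edge (hypothetical schema, not taken from the paper)."""
    src: str          # subject node, e.g. a process
    dst: str          # object node, e.g. a file or socket
    action: str       # event type, e.g. "read", "fork", "connect"
    timestamp: float  # seconds


def partition_by_time_window(events, window_seconds=60.0):
    """Group events into consecutive fixed-width time windows.

    Returns a dict mapping window index -> list of events; each list is the
    edge set of one subgraph. The window width is an assumed parameter.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    if not events:
        return {}
    start = min(e.timestamp for e in events)
    windows = defaultdict(list)
    for e in sorted(events, key=lambda ev: ev.timestamp):
        index = int((e.timestamp - start) // window_seconds)
        windows[index].append(e)
    return dict(windows)


# Toy input: three events, the last one falling into the second window.
events = [
    Event("bash", "/etc/passwd", "read", 0.0),
    Event("bash", "curl", "fork", 10.0),
    Event("curl", "10.0.0.5:443", "connect", 75.0),
]
subgraphs = partition_by_time_window(events, window_seconds=60.0)
```

Each resulting subgraph could then be passed to a feature extractor; how Augmenter actually builds and embeds its subgraphs is described in the full paper rather than in this abstract.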

Key words: Advanced persistent threat, Data provenance graph, Intrusion detection, Incremental feature, Abnormal behavior

CLC Number: 

  • TP393
[1]PASQUIER T,HAN X,GOLDSTEIN M,et al.Practical whole-system provenance capture[C]//Proceedings of the 2017 Symposium on Cloud Computing.2017:405-418.
[2]DONG F,LI S,JIANG P,et al.Are we there yet? an industrial viewpoint on provenance-based endpoint detection and response tools[C]//Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security.2023:2396-2410.
[3]ALTINISIK E,DENIZ F,SENCAR H T.ProvG-Searcher:A Graph Representation Learning Approach for Efficient Provenance Graph Search[C]//Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security.2023:2247-2261.
[4]CHENG Z,LV Q,LIANG J,et al.KAIROS:Practical Intrusion Detection and Investigation using Whole-system Provenance[C]//2024 IEEE Symposium on Security and Privacy(SP).2024.
[5]WANG S,WANG Z,ZHOU T,et al.Threatrace:Detecting and tracing host-based threats in node level through provenance graph learning[J].IEEE Transactions on Information Forensics and Security,2022,17:3972-3987.
[6]MANZOOR E,MILAJERDI S M,AKOGLU L.Fast memory-efficient anomaly detection in streaming heterogeneous graphs[C]//Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.2016:1035-1044.
[7]HAN X,PASQUIER T,BATES A,et al.UNICORN:Runtime Provenance-Based Detector for Advanced Persistent Threats[C]//Network and Distributed System Security Symposium.2020.
[8]WANG Q,HASSAN W U,LI D,et al.You Are What You Do:Hunting Stealthy Malware via Data Provenance Analysis[C]//Network and Distributed System Security Symposium.2020.
[9]YANG F,XU J,XIONG C,et al.PROGRAPHER:An Anomaly Detection System based on Provenance Graph Embedding[C]//32nd USENIX Security Symposium.2023:4355-4372.
[10]XU Z,FANG P,LIU C,et al.Depcomm:Graph summarization on system audit logs for attack investigation[C]//2022 IEEE Symposium on Security and Privacy(SP).2022:540-557.
[11]ZENG J,CHUA Z L,CHEN Y,et al.WATSON:Abstracting Behaviors from Audit Logs via Aggregation of Contextual Semantics[C]//Network and Distributed System Security Symposium.2021.
[12]ZENG J,WANG X,LIU J,et al.Shadewatcher:Recommendation-guided cyber threat analysis using system audit records[C]//2022 IEEE Symposium on Security and Privacy(SP).2022:489-506.
[13]REHMAN M U,AHMADI H,HASSAN W U.FLASH:A Comprehensive Approach to Intrusion Detection via Provenance Graph Representation Learning[C]//2024 IEEE Symposium on Security and Privacy(SP).2024:139-139.
[14]HOSSAIN M N,MILAJERDI S M,WANG J,et al.SLEUTH:Real-time attack scenario reconstruction from COTS audit data[C]//26th USENIX Security Symposium.2017:487-504.
[15]HOSSAIN M N,SHEIKHI S,SEKAR R.Combating dependence explosion in forensic analysis using alternative tag propagation semantics[C]//2020 IEEE Symposium on Security and Privacy(SP).2020:1139-1155.
[16]XIONG C,ZHU T,DONG W,et al.CONAN:A practical real-time APT detection system with high accuracy and efficiency[J].IEEE Transactions on Dependable and Secure Computing,2020,19(1):551-565.
[17]MILAJERDI S M,GJOMEMO R,ESHETE B,et al.Holmes:real-time apt detection through correlation of suspicious information flows[C]//2019 IEEE Symposium on Security and Privacy(SP).2019:1137-1152.
[18]MILAJERDI S M,ESHETE B,GJOMEMO R,et al.Poirot:Aligning attack behavior with kernel audit records for cyber threat hunting[C]//Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security.2019:1795-1812.
[19]HASSAN W U,BATES A,MARINO D.Tactical provenance analysis for endpoint detection and response systems[C]//2020 IEEE Symposium on Security and Privacy(SP).2020:1172-1189.
[20]KEROMYTIS A.Transparent computing engagement 3 data release [EB/OL].https://github.com/darpa-i2o/Transparent-Computing.
[21]HAMILTON W L,YING R,LESKOVEC J.Inductive representation learning on large graphs[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems.2017:1025-1035.
[22]BRIDGE K,SHARKEY K,COULTER D,et al.About Event Tracing [EB/OL].https://learn.microsoft.com/en-us/windows/win32/etw/about-event-tracing.
[23]The Linux Audit Framework [EB/OL].https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/security_guide/chap-system_auditing.
[24]VELICKOVIC P,FEDUS W,HAMILTON W L,et al.Deep graph infomax[C]//ICLR.2019.
[25]SCHLICHTKRULL M,KIPF T N,BLOEM P,et al.Modeling relational data with graph convolutional networks[C]//The Semantic Web:15th International Conference.2018:593-607.