Computer Science, 2024, Vol. 51, Issue (7): 380-388. doi: 10.11896/jsjkx.230400023

• Information Security •

Host Anomaly Detection Framework Based on Multifaceted Information Fusion of Semantic Features for System Calls

FAN Yi, HU Tao, YI Peng

  1. Information Technology Institute, Strategic Support Force Information Engineering University, Zhengzhou 450002, China
  • Received: 2023-04-05 Revised: 2023-08-01 Online: 2024-07-15 Published: 2024-07-10
  • Corresponding author: HU Tao (hutaondsc@163.com)
  • About author (fy.fzh@foxmail.com): FAN Yi, born in 1998, postgraduate. His main research interest is host anomaly detection.
    HU Tao, born in 1993, Ph.D., assistant researcher. His main research interests include new network architecture and active cyber defense.
  • Supported by:
    General Program of the National Natural Science Foundation of China (62176264).

Abstract: Obfuscation attacks modify the system call sequence that a process generates at run time, which lets them bypass the detection of host security protection mechanisms while achieving the same attack effect. Existing system call-based host anomaly detection methods cannot effectively detect sequences that have been modified in this way. To address this problem, this paper proposes a host anomaly detection method based on the fusion of multifaceted semantic information of system calls. Starting from the multifaceted semantic information of the system call sequence, the method mines the deep semantic information of the sequence through system call semantic abstraction and system call semantic feature extraction, and uses a multi-channel TextCNN to fuse the multifaceted information for anomaly detection. System call semantic abstraction maps each specific system call to its type; by extracting the abstract semantic information of the sequence, it shields the detection result from changes to specific system calls. System call semantic feature extraction uses the attention mechanism to obtain the key semantic features that characterize the behavior pattern of the sequence. Experimental results on the ADFA-LD dataset show that, for general host anomalies, the false alarm rate of the proposed method is below 2.2% and the F1 score reaches 0.980; for obfuscation attacks, the false alarm rate is below 2.8% and the F1 score reaches 0.969. Its detection performance is better than that of the compared methods.

Key words: Host anomaly detection, System call semantic information fusion, Obfuscation attack, Deep learning, Attention mechanism
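
The semantic abstraction step described in the abstract (mapping each specific system call to its type) can be illustrated with a minimal sketch. The type table below is hypothetical: the category names, the groupings, and the use of system call names instead of the numeric identifiers found in ADFA-LD traces are assumptions made for readability, not the mapping used in the paper.

```python
# Hypothetical system-call-to-type table (illustrative only; the paper's
# actual categories and assignments are not given in the abstract).
SYSCALL_TYPE = {
    "open": "file", "openat": "file", "read": "file", "pread64": "file",
    "write": "file", "pwrite64": "file", "close": "file",
    "fork": "process", "vfork": "process", "clone": "process",
    "execve": "process",
    "socket": "network", "connect": "network", "sendto": "network",
    "recvfrom": "network",
    "mmap": "memory", "munmap": "memory", "brk": "memory",
}


def abstract_sequence(calls, unknown="other"):
    """Replace each system call name with its coarse type.

    Calls missing from the table fall back to the `unknown` label.
    """
    return [SYSCALL_TYPE.get(call, unknown) for call in calls]


# An original trace and a variant in which calls were swapped for
# functionally similar ones, as an obfuscation attack might do.
original = ["open", "read", "fork", "execve"]
obfuscated = ["openat", "pread64", "clone", "execve"]

abstract_original = abstract_sequence(original)
abstract_obfuscated = abstract_sequence(obfuscated)

# The raw sequences differ, but their abstract (type-level) forms match.
same_after_abstraction = abstract_original == abstract_obfuscated
```

In this toy case the two raw traces differ in three of four positions, yet both reduce to the same type-level sequence, which is the sense in which abstraction can shield a detector from substitutions of specific calls. A real table would need to cover the full system call set of the target kernel.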

CLC Number:

  • TP309