Computer Science ›› 2017, Vol. 44 ›› Issue (Z11): 338-341, 361. doi: 10.11896/j.issn.1002-137X.2017.11A.071


Improved Method of Computer Virus Signature Automatic Extraction Based on N-Gram

YANG Yan and JIANG Guo-ping   

  • Online: 2018-12-01 Published: 2018-12-01

Abstract: With the rapid development of computer technology, the security threats posed by computer viruses have become increasingly serious. The traditional N-Gram algorithm has difficulty capturing byte sequences of different lengths, which leads to a lack of effective signatures and to the generation of huge signature sets that waste storage space. Instead of using fixed-length N-Gram features as the traditional approach does, an improved algorithm for automatic computer virus signature extraction based on variable-length N-Grams is proposed to solve these problems. It extracts the effective signatures and combines them into variable-length virus signatures. Taking the correlation of signature frequency into account, the algorithm uses signature concentration to extract the N-Gram features of malware samples and generates a data dictionary to save storage space. In the experimental results, compared with the traditional algorithm that uses fixed-length N-Gram features, the proposed method effectively decreases the error rate of signature extraction.

Key words: N-Gram, Virus signature, Signature concentration, Data dictionary
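
The abstract does not give the algorithm's details, so the following Python sketch only illustrates the general idea of moving from fixed-length to variable-length byte N-Grams. Everything in it is an assumption made for illustration: the function names, the `min_support` threshold, and the selection rule (keep an N-Gram if it occurs in at least `min_support` malware samples and in no benign sample), which stands in for the paper's signature concentration measure and is not the authors' formula. The data dictionary step is not modeled.

```python
from collections import Counter


def byte_ngrams(data: bytes, n: int):
    """Yield (offset, n-gram) for every length-n window of data."""
    for i in range(len(data) - n + 1):
        yield i, data[i:i + n]


def frequent_ngrams(malware, benign, n, min_support):
    """Hypothetical selection rule (not the paper's measure): keep n-grams
    found in at least min_support malware samples and in no benign sample."""
    doc_freq = Counter()
    for sample in malware:
        # Use a set so each sample counts an n-gram at most once.
        doc_freq.update({g for _, g in byte_ngrams(sample, n)})
    benign_grams = {g for s in benign for _, g in byte_ngrams(s, n)}
    return {g for g, c in doc_freq.items()
            if c >= min_support and g not in benign_grams}


def variable_length_signatures(sample: bytes, keep, n: int):
    """Merge runs of consecutive kept n-grams into longer byte strings."""
    signatures = []
    start = None  # offset where the current run of kept n-grams begins
    end = 0       # offset one past the last byte covered by the run
    for i, gram in byte_ngrams(sample, n):
        if gram in keep:
            if start is None:
                start = i
            end = i + n
        elif start is not None:
            signatures.append(sample[start:end])
            start = None
    if start is not None:
        signatures.append(sample[start:end])
    return signatures


if __name__ == "__main__":
    # Toy byte strings, invented for this example.
    malware = [b"\x90\x90\xde\xad\xbe\xef\x00", b"\x01\xde\xad\xbe\xef\x02\x03"]
    benign = [b"\x00\x01\x02\x03\x90\x90"]
    keep = frequent_ngrams(malware, benign, n=2, min_support=2)
    print(variable_length_signatures(malware[0], keep, n=2))
```

With these toy inputs, three overlapping 2-grams survive the filter and are merged into one 4-byte signature, which shows how a variable-length signature can replace several fixed-length entries in the signature set.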

