Computer Science, 2020, Vol. 47, Issue (7): 292-298. doi: 10.11896/jsjkx.190600156

• Information Security •

Network Representation Learning Algorithm Based on Vulnerability Threat Schema

HUANG Yi1,2, SHEN Guo-wei1,2, ZHAO Wen-bo1, GUO Chun1,2

  1 Department of Computer Science and Technology, Guizhou University, Guiyang 550025, China
  2 Guizhou Provincial Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China
  • Received: 2019-06-19 Online: 2020-07-15 Published: 2020-07-16
  • Corresponding author: SHEN Guo-wei (gwshen@gzu.edu.cn)
  • Author contact: yHuang_Addy@163.com
  • About author: HUANG Yi, born in 1997, postgraduate, is a member of China Computer Federation. Her main research interests include representation learning and network security.
    SHEN Guo-wei, born in 1986, Ph.D., associate professor, is a member of China Computer Federation. His main research interests include cyberspace security and big data.
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61802081), the Science and Technology Major Project of Guizhou Province (20183001), and the Guizhou Provincial Science and Technology Plan (20161052, 20171051).

Abstract: Threat intelligence analysis provides effective information for network attack and defense. Fine-grained mining of cyber threat intelligence data, that is, extracting the security entities and the relationships among them, is a focus of current threat intelligence research. When applied to large-scale threat intelligence data, traditional machine learning algorithms face sparsity and high dimensionality, and therefore struggle to capture network information effectively. To address the problem of classifying network security vulnerabilities, this paper proposes HSEN2vec, a network representation learning algorithm based on the vulnerability threat schema. The algorithm aims to capture as much of the structural and semantic information of the heterogeneous security entity network as possible and to learn low-dimensional vector representations of the security entities. It first obtains the structural information of the heterogeneous security entity network based on the vulnerability threat schema, then models that information with the Skip-gram model, and finally uses negative sampling for efficient prediction to produce the final vector representations. Experiments on national security vulnerability data show that, compared with other methods, the proposed algorithm improves accuracy and other evaluation metrics for vulnerability classification.

Key words: Heterogeneous security entity network, Network representation learning, Threat schema, Vulnerability
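The pipeline summarized in the abstract (schema-guided structure extraction, then Skip-gram with negative sampling) can be illustrated with a minimal Python sketch. This is not the authors' HSEN2vec implementation: the toy graph, the node types (`vuln`, `product`, `vendor`), the schema, and the helper names `build_adjacency`, `schema_walk` and `training_pairs` are all hypothetical, and negatives are drawn uniformly as a simplification. The sketch only produces the (center, context, negatives) training triples that a Skip-gram model would consume; it does not train embeddings.

```python
import random
from collections import defaultdict

# Hypothetical toy heterogeneous security entity network (assumed, not from the paper).
NODE_TYPE = {
    "CVE-A": "vuln", "CVE-B": "vuln",
    "prod-1": "product", "prod-2": "product",
    "vendor-x": "vendor",
}
EDGES = [
    ("CVE-A", "prod-1"), ("CVE-B", "prod-1"), ("CVE-B", "prod-2"),
    ("prod-1", "vendor-x"), ("prod-2", "vendor-x"),
]
# Assumed symmetric schema: vulnerability -> product -> vendor -> product -> vulnerability.
SCHEMA = ["vuln", "product", "vendor", "product", "vuln"]


def build_adjacency(edges):
    """Build an undirected adjacency list from (u, v) edge pairs."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def schema_walk(adj, node_type, start, schema, length, rng):
    """Random walk in which each step must move to a node of the next schema type."""
    cycle = schema[:-1]  # the last type repeats the first, so drop it when cycling
    if node_type[start] != cycle[0]:
        raise ValueError("start node must match the first schema type")
    walk = [start]
    while len(walk) < length:
        wanted = cycle[len(walk) % len(cycle)]
        candidates = [n for n in adj[walk[-1]] if node_type[n] == wanted]
        if not candidates:
            break  # dead end: no neighbor of the required type
        walk.append(rng.choice(candidates))
    return walk


def training_pairs(walk, window, num_neg, vocab, rng):
    """Build (center, context, negatives) triples for Skip-gram with negative sampling."""
    triples = []
    for i, center in enumerate(walk):
        lo, hi = max(0, i - window), min(len(walk), i + window + 1)
        for j in range(lo, hi):
            context = walk[j]
            if j == i or context == center:
                continue
            # Uniform negative sampling (a simplification of frequency-based sampling).
            pool = [n for n in vocab if n not in (center, context)]
            negatives = [rng.choice(pool) for _ in range(num_neg)]
            triples.append((center, context, negatives))
    return triples


rng = random.Random(42)
adj = build_adjacency(EDGES)
walk = schema_walk(adj, NODE_TYPE, "CVE-A", SCHEMA, length=9, rng=rng)
triples = training_pairs(walk, window=2, num_neg=3, vocab=list(NODE_TYPE), rng=rng)
print(walk)
print(triples[0])
```

Constraining each step to the next type in the schema is what lets the walk carry semantic information (which entity types co-occur along a threat pattern) in addition to plain connectivity; the resulting triples could then be fed to any Skip-gram trainer.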

CLC Number: 

  • TP393.0
[1] 蒋成满, 华保健, 樊淇梁, 朱洪军, 徐波, 潘志中.
Empirical Security Study of Native Code in Python Virtual Machines
Computer Science, 2022, 49(6A): 474-479. https://doi.org/10.11896/jsjkx.210600200
[2] 张潆藜, 马佳利, 刘子昂, 刘新, 周睿.
Overview of Vulnerability Detection Methods for Ethereum Solidity Smart Contracts
Computer Science, 2022, 49(3): 52-61. https://doi.org/10.11896/jsjkx.210700004
[3] 蒋宗礼, 樊珂, 张津丽.
Generative Adversarial Network and Meta-path Based Heterogeneous Network Representation Learning
Computer Science, 2022, 49(1): 133-139. https://doi.org/10.11896/jsjkx.201000179
[4] 李明磊, 黄晖, 陆余良, 朱凯龙.
SymFuzz: Vulnerability Detection Technology Under Complex Path Conditions
Computer Science, 2021, 48(5): 25-31. https://doi.org/10.11896/jsjkx.200600128
[5] 郑建云, 庞建民, 周鑫, 王军.
Enhanced Binary Vulnerability Mining Based on Constraint Derivation
Computer Science, 2021, 48(3): 320-326. https://doi.org/10.11896/jsjkx.200700047
[6] 富坤, 赵晓梦, 付紫桐, 高金辉, 马浩然.
Deep Network Representation Learning Method on Incomplete Information Networks
Computer Science, 2021, 48(12): 212-218. https://doi.org/10.11896/jsjkx.201000015
[7] 李毅豪, 洪征, 林培鸿.
Fuzzing Test Case Generation Method Based on Depth-first Search
Computer Science, 2021, 48(12): 85-93. https://doi.org/10.11896/jsjkx.200800178
[8] 潘雨, 邹军华, 王帅辉, 胡谷雨, 潘志松.
Deep Community Detection Algorithm Based on Network Representation Learning
Computer Science, 2021, 48(11A): 198-203. https://doi.org/10.11896/jsjkx.210200113
[9] 赵曼, 赵加坤, 刘金诺.
Link Prediction Algorithm Based on Ego Networks Structure and Network Representation Learning
Computer Science, 2021, 48(11A): 211-217. https://doi.org/10.11896/jsjkx.201200231
[10] 涂良琼, 孙小兵, 张佳乐, 蔡杰, 李斌, 薄莉莉.
Survey of Vulnerability Detection Tools for Smart Contracts
Computer Science, 2021, 48(11): 79-88. https://doi.org/10.11896/jsjkx.210600117
[11] 丁钰, 魏浩, 潘志松, 刘鑫.
Survey of Network Representation Learning
Computer Science, 2020, 47(9): 52-59. https://doi.org/10.11896/jsjkx.190300004
[12] 蒋宗礼, 李苗苗, 张津丽.
Graph Convolution of Fusion Meta-path Based Heterogeneous Network Representation Learning
Computer Science, 2020, 47(7): 231-235. https://doi.org/10.11896/jsjkx.190600085
[13] 龚扣林, 周宇, 丁笠, 王永超.
Vulnerability Detection Using Bidirectional Long Short-term Memory Networks
Computer Science, 2020, 47(5): 295-300. https://doi.org/10.11896/jsjkx.190800046
[14] 张虎, 周晶晶, 高海慧, 王鑫.
Network Representation Learning Method on Fusing Node Structure and Content
Computer Science, 2020, 47(12): 119-124. https://doi.org/10.11896/jsjkx.190900027
[15] 顾秋阳, 琚春华, 吴功兴.
Social Network Information Recommendation Model Combining Deep Autoencoder and Network Representation Learning
Computer Science, 2020, 47(11): 101-112. https://doi.org/10.11896/jsjkx.200400120