Computer Science ›› 2025, Vol. 52 ›› Issue (12): 358-366. doi: 10.11896/jsjkx.241000083

• Information Security •

Research on Malicious Domain Detection Based on Heterogeneous Graph Inductive Learning

LIANG Jianpeng1, MO Xiuliang1, WANG Pengxiang2, WANG Huanran3, WANG Chundong4

  1. 1 School of Computer Science and Engineering, Tianjin University of Technology, Tianjin 300384, China
    2 Belarusian State University, Minsk 220030, The Republic of Belarus
    3 College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China
    4 Tianjin Public Security Police Profession College, Tianjin 300382, China
  • Received: 2024-10-17 Revised: 2025-06-10 Published: 2025-12-15 Online: 2025-12-09
  • Corresponding author: MO Xiuliang (moxiuliang@163.com)
  • Author contact: 2947279118@qq.com
  • About author: LIANG Jianpeng, born in 2000, master, is a member of CCF (No. V7671G). His main research interest is malicious domain detection.
    MO Xiuliang, born in 1969, postgraduate, associate professor. His main research interests include information security and network security.
  • Supported by:
    This work was supported by the State Key Program of National Natural Science Foundation of China (61931019).

Abstract: Current malicious domain detection techniques based on graph neural networks rely on domain experts to select meta-paths so that heterogeneous graphs can be converted into homogeneous graphs for transductive learning. This approach struggles to exploit the rich topological information in the graph, and it scales and generalizes poorly. To address this, the paper proposes a malicious domain detection technique based on inductive learning on heterogeneous graphs. First, a meta-path generation algorithm is used to construct a heterogeneous information network whose nodes are domains, hosts, and domain registration information. Second, to overcome the poor applicability of transductively trained models in real networks, the inductive graph neural network HeteroGAT is used to learn the general structure of the heterogeneous graph formed by the training samples, and an autoencoder-based domain feature representation is used to improve detection performance. Finally, the proposed algorithm is compared with machine learning and deep learning methods on public datasets. Experimental results show that the proposed method achieves better performance metrics and still handles data imbalance effectively when few training samples are available, demonstrating good robustness.

Key words: Network security, Malicious domain detection, Inductive learning, Heterogeneous graph, Meta-path
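
As a rough illustration of the heterogeneous information network described in the abstract, the following minimal Python sketch stores typed nodes (domains, hosts, registrants) and follows a meta-path between them. It is not the paper's meta-path generation algorithm or its HeteroGAT model. The class `HeteroGraph`, the relation names `resolves_to` and `registered_by`, and the sample domains and addresses are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class HeteroGraph:
    """Tiny typed graph: nodes are (type, name) pairs, edges carry a relation label."""

    def __init__(self):
        # adjacency[node][relation] -> set of neighbouring nodes
        self.adjacency = defaultdict(lambda: defaultdict(set))

    def add_edge(self, src, relation, dst):
        # Store both directions so a meta-path can be walked either way.
        self.adjacency[src][relation].add(dst)
        self.adjacency[dst]["rev_" + relation].add(src)

    def follow(self, start, relations):
        """Return the nodes reached from `start` by following `relations` in order."""
        frontier = {start}
        for relation in relations:
            frontier = {nbr for node in frontier
                        for nbr in self.adjacency[node][relation]}
        return frontier


def build_example_graph():
    g = HeteroGraph()
    # Hypothetical observations: domain -> host (IP) and domain -> registrant.
    g.add_edge(("domain", "a.example"), "resolves_to", ("host", "192.0.2.1"))
    g.add_edge(("domain", "b.example"), "resolves_to", ("host", "192.0.2.1"))
    g.add_edge(("domain", "b.example"), "registered_by", ("registrant", "r1"))
    g.add_edge(("domain", "c.example"), "registered_by", ("registrant", "r1"))
    return g


graph = build_example_graph()
# Meta-path domain -> host -> domain: domains that share a host with a.example.
same_host = graph.follow(("domain", "a.example"),
                         ["resolves_to", "rev_resolves_to"])
print(sorted(name for _, name in same_host))  # ['a.example', 'b.example']
```

The start node appears in its own result because the meta-path leads back through the shared host. A real pipeline would typically drop such self-matches and attach feature vectors to each node before training a graph neural network.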

CLC Number:

  • TP393.08