Computer Science ›› 2022, Vol. 49 ›› Issue (3): 105-112. doi: 10.11896/jsjkx.201000177

• Database & Big Data & Data Science •

  • Corresponding author: YUAN De-yu (yuandeyu@ppsuc.edu.cn)
  • About author: CHEN Shi-cong (chenshicong24@163.com)

Node Label Classification Algorithm Based on Structural Depth Network Embedding Model

CHEN Shi-cong1, YUAN De-yu1,2, HUANG Shu-hua1,2, YANG Ming1,2   

  1 School of Information and Cyber Security, People’s Public Security University of China, Beijing 100038, China
  2 Key Laboratory of Safety Precautions and Risk Assessment, Ministry of Public Security, Beijing 100038, China
  • Received: 2020-10-29 Revised: 2021-05-23 Online: 2022-03-15 Published: 2022-03-15
  • About author: CHEN Shi-cong, born in 1997, master. His main research interests include cyberspace security and law enforcement technology.
    YUAN De-yu, born in 1986, Ph.D, lecturer, Ph.D supervisor. His main research interests include cyber security and complex networks.
  • Supported by:
    National Social Science Foundation of China (20AZD114), Fundamental Research Funds for the Central Universities (2021JKF215), Open Research Fund of the Public Security Behavioral Science Laboratory, People’s Public Security University of China (2020SYS03), and Open Research Fund of the Key Laboratory of the Police Internet of Things Application Technology.

Abstract: In the Internet era, with data volumes growing explosively, traditional algorithms can no longer meet the demands of processing large-scale, multi-type data. In recent years, graph embedding algorithms, which learn the features of a graph network, have achieved strong results in link prediction, network reconstruction and node classification. Building on the traditional auto-encoder model, this paper proposes a new algorithm that combines the SDNE algorithm with a link-prediction similarity matrix. A high-order loss function is introduced into back-propagation, and performance is tuned according to the new characteristics of the auto-encoder, which addresses the drawback of traditional algorithms that judge node similarity in only a single way. A simple model is built to analyze and justify the optimization. Compared with SDNE, the best-performing algorithm in recent research, the proposed algorithm improves both Micro-F1 and Macro-F1 by close to 1%, and the visualized classification results are good. The study also finds that the optimal value of the hyperparameter of the high-order loss function lies roughly in the range 1~10, and that varying this value leaves the robustness of the whole network basically stable.
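
The abstract says a high-order loss term is added to the SDNE objective but does not give its formula. As a rough sketch only: SDNE's published objective combines a second-order reconstruction loss, a first-order proximity loss and a regularizer, and one plausible way to fold in a link-prediction similarity matrix is an extra term of the same Laplacian style. The similarity matrix $S^{\mathrm{sim}}$, the weight $\gamma$ and the form of $\mathcal{L}_{\mathrm{high}}$ below are assumptions made for illustration, not the paper's definitions.

```latex
% SDNE objective terms (Wang et al., 2016):
%   X = input adjacency rows, \hat{X} = reconstruction, B = penalty mask,
%   a_{ij} = adjacency entries, y_i = embedding of node i.
\begin{aligned}
\mathcal{L}_{\mathrm{2nd}} &= \lVert (\hat{X} - X) \odot B \rVert_F^2, \\
\mathcal{L}_{\mathrm{1st}} &= \sum_{i,j} a_{ij} \, \lVert y_i - y_j \rVert_2^2, \\
% Hypothetical high-order term: s^{sim}_{ij} is a link-prediction
% similarity score between nodes i and j (assumed, not from the paper).
\mathcal{L}_{\mathrm{high}} &= \sum_{i,j} s^{\mathrm{sim}}_{ij} \, \lVert y_i - y_j \rVert_2^2, \\
\mathcal{L} &= \mathcal{L}_{\mathrm{2nd}} + \alpha \, \mathcal{L}_{\mathrm{1st}}
             + \gamma \, \mathcal{L}_{\mathrm{high}} + \nu \, \mathcal{L}_{\mathrm{reg}}.
\end{aligned}
```

Under this reading, $\gamma$ would play the role of the hyperparameter whose best values the abstract places at roughly 1~10.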

Key words: Auto-encoder, Complex network, Deep learning, Network embedding, Node classification
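
The two evaluation metrics named in the abstract differ only in how they average over classes: Micro-F1 pools the counts of all classes before computing F1, while Macro-F1 computes F1 per class and then takes the unweighted mean. The sketch below illustrates this for the single-label case; the function name `micro_macro_f1` and the toy labels are hypothetical and are not taken from the paper, whose datasets may be multi-label.

```python
from collections import Counter


def micro_macro_f1(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label multiclass predictions."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # class p was predicted wrongly
            fn[t] += 1          # class t was missed

    def f1(tp_, fp_, fn_):
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Micro: pool counts over all classes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then average with equal class weight.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro


if __name__ == "__main__":
    # Toy example: one node of class 0 is misclassified as class 1.
    micro, macro = micro_macro_f1([0, 0, 0, 1, 1, 2], [0, 0, 1, 1, 1, 2])
    print(f"Micro-F1 = {micro:.4f}, Macro-F1 = {macro:.4f}")
```

Because Macro-F1 weights every class equally, it is more sensitive than Micro-F1 to errors on rare classes, which is why the two are usually reported together.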

CLC Number: 

  • TP311
[1] 饶志双, 贾真, 张凡, 李天瑞.
Key-Value Relational Memory Networks for Question Answering over Knowledge Graph
Computer Science, 2022, 49(9): 202-207. https://doi.org/10.11896/jsjkx.220300277
[2] 汤凌韬, 王迪, 张鲁飞, 刘盛云.
Federated Learning Scheme Based on Secure Multi-party Computation and Differential Privacy
Computer Science, 2022, 49(9): 297-305. https://doi.org/10.11896/jsjkx.210800108
[3] 郑文萍, 刘美麟, 杨贵.
Community Detection Algorithm Based on Node Stability and Neighbor Similarity
Computer Science, 2022, 49(9): 83-91. https://doi.org/10.11896/jsjkx.220400146
[4] 徐涌鑫, 赵俊峰, 王亚沙, 谢冰, 杨恺.
Temporal Knowledge Graph Representation Learning
Computer Science, 2022, 49(9): 162-171. https://doi.org/10.11896/jsjkx.220500204
[5] 王剑, 彭雨琦, 赵宇斐, 杨健.
Survey of Social Network Public Opinion Information Extraction Based on Deep Learning
Computer Science, 2022, 49(8): 279-293. https://doi.org/10.11896/jsjkx.220300099
[6] 郝志荣, 陈龙, 黄嘉成.
Class Discriminative Universal Adversarial Attack for Text Classification
Computer Science, 2022, 49(8): 323-329. https://doi.org/10.11896/jsjkx.220200077
[7] 姜梦函, 李邵梅, 郑洪浩, 张建朋.
Rumor Detection Model Based on Improved Position Embedding
Computer Science, 2022, 49(8): 330-335. https://doi.org/10.11896/jsjkx.210600046
[8] 孙奇, 吉根林, 张杰.
Non-local Attention Based Generative Adversarial Network for Video Abnormal Event Detection
Computer Science, 2022, 49(8): 172-177. https://doi.org/10.11896/jsjkx.210600061
[9] 胡艳羽, 赵龙, 董祥军.
Two-stage Deep Feature Selection Extraction Algorithm for Cancer Classification
Computer Science, 2022, 49(7): 73-78. https://doi.org/10.11896/jsjkx.210500092
[10] 程成, 降爱莲.
Real-time Semantic Segmentation Method Based on Multi-path Feature Extraction
Computer Science, 2022, 49(7): 120-126. https://doi.org/10.11896/jsjkx.210500157
[11] 侯钰涛, 阿布都克力木·阿布力孜, 哈里旦木·阿布都克里木.
Advances in Chinese Pre-training Models
Computer Science, 2022, 49(7): 148-163. https://doi.org/10.11896/jsjkx.211200018
[12] 周慧, 施皓晨, 屠要峰, 黄圣君.
Robust Deep Neural Network Learning Based on Active Sampling
Computer Science, 2022, 49(7): 164-169. https://doi.org/10.11896/jsjkx.210600044
[13] 苏丹宁, 曹桂涛, 王燕楠, 王宏, 任赫.
Survey of Deep Learning for Radar Emitter Identification Based on Small Sample
Computer Science, 2022, 49(7): 226-235. https://doi.org/10.11896/jsjkx.210600138
[14] 刘伟业, 鲁慧民, 李玉鹏, 马宁.
Survey on Finger Vein Recognition Research
Computer Science, 2022, 49(6A): 1-11. https://doi.org/10.11896/jsjkx.210400056
[15] 孙福权, 崔志清, 邹彭, 张琨.
Brain Tumor Segmentation Algorithm Based on Multi-scale Features
Computer Science, 2022, 49(6A): 12-16. https://doi.org/10.11896/jsjkx.210700217