Computer Science ›› 2024, Vol. 51 ›› Issue (4): 117-123.doi: 10.11896/jsjkx.230100018

• Database & Big Data & Data Science • Previous Articles     Next Articles

Graph Sampling Algorithm Based on Representative Node Expansion to Maintain CommunityStructure

HONG Yu1, CHEN Hongchang2, ZHANG Jianpeng2, HUANG Ruiyang2 , LI Shaomei2   

  1. 1 College of Cyberspace Security,Zhengzhou University,Zhengzhou 450000,China
    2 Institute of Information Technology,PLA Strategic Support Force Information Engineering University,Zhengzhou 450000,China
  • Received:2023-01-03 Revised:2023-12-19 Online:2024-04-15 Published:2024-04-10
  • Supported by:
    National Natural Science Foundation of China(62002384),China Postdoctoral Science Foundation(2020M683760) and Songshan Laboratory Project(221100210700-03).

Abstract: Graph sampling is widely used in real life as a method to simplify large-scale graphs and retain specified properties.However,most of the current research focuses on preserving node-level properties,such as degree distribution,while ignoring more important information such as the community structure of graphs.To solve this problem,a graph sampling algorithm is proposed to maintain the community structure.The algorithm is divided into two steps.The first step is to initialize the community representative points,and the node importance is calculated according to the proposed node importance calculation formula,and then the representative nodes of each community are selected.The second step is to expand the community structure.For each community,it selects the node that can introduce the least additional neighbors to join the community until the upper limit of the community node is reached.Comparative experiments are conducted on a number of real data sets,and multiple evaluation indicators are adopted to evaluate the experimental results.Experimental results show that the proposed sampling algorithm can well maintain the overall community structure,and provides a feasible solution for sampling community structure of large-scale graphs.

Key words: Graph sampling, Community structure, Representative nodes, Expansion, Importance

CLC Number: 

  • TP391
[1]ROZEMBERCZKI B,KISS O,SARKAR R.Little Ball of Fur:A Python Library for Graph Sampling[C]//Proceedings of the 29th ACM International Conference on Information & Know-ledge Management.Virtual Event Ireland:Association for Computing Machinery,2020:3133-3140.
[2]AHMED N,NEVILLE J,KOMPELLA R.Network Sampling:From Static to Streaming Graphs [J].ACM Transactions on Knowledge Discovery from Data,2014,8(2):1-56.
[3]LESKOVEC U,FALOUTSOS C.Sampling from large graphs[C]//Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.New York:Association for Computing Machinery,2006:631-636.
[4]RIBERIO B,TOWSLEY D.Estimating and sampling graphswith multidimensional random walks [C]//Internet Measurement Conference.New York:Association for Computing Machinery,2010:390-403.
[5]ZHAO Y,JIANG H J,CHEN Q A,et al.Preserving Minority Structures in Graph Sampling [J].IEEE Transactions on Visua-lization and Computer Graphics,2021,27(2):1698-1708.
[6]MAIYA A S,BERGER-WOLF T Y.Sampling communitystructure[C]//Proceedings of the 19th International World Wide Web Conference.New York:Association for Computing Machinery,2010:701-710.
[7]WANG F,CHEUNG G,WANG Y C.Low-complexity GraphSampling With Noise and Signal Reconstruction via Neumann Series[J].IEEE Transactions on Signal Processing,2019,67(21):5511-5526.
[8]JIAN B,SHI J M,ZHANG W S,et al.Graph sampling for Internet topologies using normalized Laplacian spectral features[J].Information Sciences,2019,481:574-603.
[9]ZHOU Z G,SHI C,SHEN X L,et al.Context-aware Sampling of Large Networks via Graph Representation Learning[J].IEEE Transactions on Visualization and Computer Graphics,2021,27(2):1709-1719.
[10]HU J B,DAI G H,WANG Y,et al.GraphSDH:A GeneralGraph Sampling Framework with Distribution and Hierarchy[C]//2020 IEEE High Performance Extreme Computing Conference(HPEC).Waltham:IEEE,2020:1-7.
[11]MALL R,LANGONE R,SUYKENS J A K.FURS:Fast andUnique Representative Subset selection retaining large-scale community structure[J].Social Network Analysis and Mining,2013,3(4):1075-1095.
[12]ZHANG J P,CHEN H C,YU D J,et al.Cluster-preserving sampling algorithm for large-scale graphs[J].Science China Information Sciences,2022,66:112103.
[13]ZHANG J P,PEI Y L,GEORGE F,et al.Evaluation of theSample Clustering Process on Graphs[J].IEEE Transactions on Knowledge and Data Engineering,2020,32(7):1333-1347.
[14]SANTO F,DARTO H.Community detection in networks:Auser guide[J].Physics Reports,2016,659:1-44.
[15]MARK E J N.Modularity and community structure in networks[J].Proceedings of the National Academy of Sciences,2006,103(23):8577-8582.
[1] GUO Yuxing, YAO Kaixuan, WANG Zhiqiang, WEN Liangliang, LIANG Jiye. Black-box Graph Adversarial Attacks Based on Topology and Feature Fusion [J]. Computer Science, 2024, 51(1): 355-362.
[2] WANG Junlu, LIU Qiang, ZHANG Ran, JI Wanting, SONG Baoyan. Blockchain-based Dual-branch Structure Expansion Model [J]. Computer Science, 2023, 50(8): 365-371.
[3] MENG Yiyue, PENG Rong, LYU Qibiao. Text Material Recommendation Method Combining Label Classification and Semantic QueryExpansion [J]. Computer Science, 2023, 50(1): 76-86.
[4] ZHENG Wen-ping, LIU Mei-lin, YANG Gui. Community Detection Algorithm Based on Node Stability and Neighbor Similarity [J]. Computer Science, 2022, 49(9): 83-91.
[5] XU Hui, KANG Jin-meng, ZHANG Jia-wan. Digital Mural Inpainting Method Based on Feature Perception [J]. Computer Science, 2022, 49(6): 217-223.
[6] PENG Yun-cong, QIN Xiao-lin, ZHANG Li-ge, GU Yong-xiang. Survey on Few-shot Learning Algorithms for Image Classification [J]. Computer Science, 2022, 49(5): 1-9.
[7] ZHENG Wen-ping, WANG Ning, YANG Gui. Overlapping Community Detection Algorithm Based on Local Path Information [J]. Computer Science, 2022, 49(12): 155-162.
[8] WANG Dong, XIAO Bing-bing, JIN Chen-guang, LI Zheng, LI Xiao-ruo, ZHU Bing-nan. Consensus Optimization Algorithm for Proof of Importance Based on Dynamic Grouping [J]. Computer Science, 2022, 49(12): 362-367.
[9] PAN Yu, WANG Shuai-hui, ZHANG Lei, HU Gu-yu, ZOU Jun-hua, WANG Tian-feng, PAN Zhi-song. Survey of Community Detection in Complex Network [J]. Computer Science, 2022, 49(11A): 210800144-11.
[10] HAN Xiao, ZHANG Zhe-qing, YAN Li. Temporal RDF Modeling Based on Relational Database [J]. Computer Science, 2022, 49(11): 90-97.
[11] CHEN Xiang-tao, ZHAO Mei-jie, YANG Mei. Overlapping Community Detection Algorithm Based on Subgraph Structure [J]. Computer Science, 2021, 48(9): 244-250.
[12] LI Na-na, WANG Yong, ZHOU Lin, ZOU Chun-ming, TIAN Ying-jie, GUO Nai-wang. DDoS Attack Random Forest Detection Method Based on Secondary Screening of Feature Importance [J]. Computer Science, 2021, 48(6A): 464-467.
[13] MA Yuan-yuan, HAN Hua, QU Qian-qian. Importance Evaluation Algorithm Based on Node Intimate Degree [J]. Computer Science, 2021, 48(5): 140-146.
[14] HE Bin, XU Dao-yun. Community Structure of Regular (3,4)-CNF Formula [J]. Computer Science, 2021, 48(4): 26-30.
[15] XUE Zhan-ao, SUN Bing-xin, HOU Hao-dong, JING Meng-meng. Optimal Granulation Selection Method Based on Multi-granulation Rough Intuitionistic Hesitant Fuzzy Sets [J]. Computer Science, 2021, 48(10): 98-106.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!