• 数据库&大数据&数据科学 •

### 基于代表性节点扩张的保持社区结构的图采样算法

1. 1 郑州大学网络空间安全学院 郑州450000
2 中国人民解放军战略支援部队信息工程大学信息技术研究所 郑州450000
• 收稿日期:2023-01-03 修回日期:2023-12-19 出版日期:2024-04-15 发布日期:2024-04-10
• 通讯作者: 张建朋(j_zhang_edu@sina.com)
• 作者简介:(1161738832@qq.com)
• 基金资助:
国家自然科学基金(62002384);中国博士后科学基金面上项目(2020M683760);嵩山实验室项目(221100210700-03)

### Graph Sampling Algorithm Based on Representative Node Expansion to Maintain CommunityStructure

HONG Yu1, CHEN Hongchang2, ZHANG Jianpeng2, HUANG Ruiyang2 , LI Shaomei2

1. 1 College of Cyberspace Security,Zhengzhou University,Zhengzhou 450000,China
2 Institute of Information Technology,PLA Strategic Support Force Information Engineering University,Zhengzhou 450000,China
• Received:2023-01-03 Revised:2023-12-19 Online:2024-04-15 Published:2024-04-10
• Supported by:
National Natural Science Foundation of China(62002384),China Postdoctoral Science Foundation(2020M683760) and Songshan Laboratory Project(221100210700-03).

Abstract: Graph sampling is widely used in real life as a method to simplify large-scale graphs and retain specified properties.However,most of the current research focuses on preserving node-level properties,such as degree distribution,while ignoring more important information such as the community structure of graphs.To solve this problem,a graph sampling algorithm is proposed to maintain the community structure.The algorithm is divided into two steps.The first step is to initialize the community representative points,and the node importance is calculated according to the proposed node importance calculation formula,and then the representative nodes of each community are selected.The second step is to expand the community structure.For each community,it selects the node that can introduce the least additional neighbors to join the community until the upper limit of the community node is reached.Comparative experiments are conducted on a number of real data sets,and multiple evaluation indicators are adopted to evaluate the experimental results.Experimental results show that the proposed sampling algorithm can well maintain the overall community structure,and provides a feasible solution for sampling community structure of large-scale graphs.

• TP391
 [1]ROZEMBERCZKI B,KISS O,SARKAR R.Little Ball of Fur:A Python Library for Graph Sampling[C]//Proceedings of the 29th ACM International Conference on Information & Know-ledge Management.Virtual Event Ireland:Association for Computing Machinery,2020:3133-3140.[2]AHMED N,NEVILLE J,KOMPELLA R.Network Sampling:From Static to Streaming Graphs [J].ACM Transactions on Knowledge Discovery from Data,2014,8(2):1-56.[3]LESKOVEC U,FALOUTSOS C.Sampling from large graphs[C]//Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.New York:Association for Computing Machinery,2006:631-636.[4]RIBERIO B,TOWSLEY D.Estimating and sampling graphswith multidimensional random walks [C]//Internet Measurement Conference.New York:Association for Computing Machinery,2010:390-403.[5]ZHAO Y,JIANG H J,CHEN Q A,et al.Preserving Minority Structures in Graph Sampling [J].IEEE Transactions on Visua-lization and Computer Graphics,2021,27(2):1698-1708.[6]MAIYA A S,BERGER-WOLF T Y.Sampling communitystructure[C]//Proceedings of the 19th International World Wide Web Conference.New York:Association for Computing Machinery,2010:701-710.[7]WANG F,CHEUNG G,WANG Y C.Low-complexity GraphSampling With Noise and Signal Reconstruction via Neumann Series[J].IEEE Transactions on Signal Processing,2019,67(21):5511-5526.[8]JIAN B,SHI J M,ZHANG W S,et al.Graph sampling for Internet topologies using normalized Laplacian spectral features[J].Information Sciences,2019,481:574-603.[9]ZHOU Z G,SHI C,SHEN X L,et al.Context-aware Sampling of Large Networks via Graph Representation Learning[J].IEEE Transactions on Visualization and Computer Graphics,2021,27(2):1709-1719.[10]HU J B,DAI G H,WANG Y,et al.GraphSDH:A GeneralGraph Sampling Framework with Distribution and Hierarchy[C]//2020 IEEE High Performance Extreme Computing Conference(HPEC).Waltham:IEEE,2020:1-7.[11]MALL R,LANGONE R,SUYKENS J A K.FURS:Fast andUnique Representative Subset selection retaining large-scale community structure[J].Social Network Analysis and Mining,2013,3(4):1075-1095.[12]ZHANG J P,CHEN H C,YU D J,et al.Cluster-preserving sampling algorithm for large-scale graphs[J].Science China Information Sciences,2022,66:112103.[13]ZHANG J P,PEI Y L,GEORGE F,et al.Evaluation of theSample Clustering Process on Graphs[J].IEEE Transactions on Knowledge and Data Engineering,2020,32(7):1333-1347.[14]SANTO F,DARTO H.Community detection in networks:Auser guide[J].Physics Reports,2016,659:1-44.[15]MARK E J N.Modularity and community structure in networks[J].Proceedings of the National Academy of Sciences,2006,103(23):8577-8582.
Viewed
Full text

Abstract

Cited

Shared
Discussed