面向并发图分析的局部性感知的缓存管理策略

doi:10.11896/jsjkx.250200062

Abstract

Abstract: With the rapid growth of graph computing,modern graph platforms routinely execute a large number of concurrent graph analytics tasks to extract the latent value in massive datasets.Consequently,concurrent graph processing has been widely adopted in domains,including intelligent education,public administration,and news media.However,most existing graph proces-sing systems are originally designed for single-task execution and suffer from excessive redundant data accesses when handling concurrent workloads.Although prior studies have observed significant redundancy in in-memory graph data across concurrent tasks and have attempted to exploit temporal and spatial locality to share underlying graph data,they largely overlook the data locality in private state updates.This limitation leads to low cache utilization and,ultimately,degraded system throughput.To address this challenge,this paper proposes CCG,a locality-aware cache management strategy for concurrent graph analysis,which fully exploits both temporal and spatial locality across tasks to reduce redundant data accesses and synchronization overhead.Specifically,CCG efficiently buffers and incrementally merges redundant updates,leveraging data locality to perform high-throughput batch updates in memory.This design minimizes access costs,mitigates cache thrashing,and significantly improves concurrency performance.Moreover,CCG employs a multi-level cache hierarchy to enable layered buffering and merging,thereby eliminating synchronization and locking overhead during private state updates.Experimental results show that CCG improves system throughput by 2.3×～7.8× over GRASP.

Key words: Concurrent graph analysis, Cache architecture, Data locality, Cache thrashing, Throughput

CLC Number:

TP391

LI Hanqiao, ZHAO Yuanjun. Locality-aware Cache Management Strategy for Concurrent Graph Analysis[J].Computer Science, 2025, 52(12): 125-132.

References

[1]PAGE L,BRIN S,MOTWANI R,et al.The PageRank citation ranking:bringing order to the web[R].Stanford:Stanford InfoLab,1998.
[2]ZHANG Y,GAO Q,GAO L,et al.Maiter:An asynchronousgraph processing framework for delta-based accumulative iterative computation[J].IEEE Transactions on Parallel and Distri-buted Systems,2013,25(8):2091-2100.
[3]SHUN J,BLELLOCH G E.Ligra:a lightweight graph processing framework for shared memory[C]//Proceedings of the 2013 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming.2013:135-146.
[4]QIAO S J,GUO J,HAN N,et al.Parallel Algorithm for Discovering Communities in Large-Scale Complex Netwoks[J].Chinese Journal of Computers,2017,40(3):687-700.
[5]YANG K,ZHANG M X,CHEN K,et al.Knightking:a fast distributed graph random walk engine[C]//Proceedings of the 2019 ACM Symposium on Operating Systems Principles.2019:524-537.
[6]ZHANG Y,LIAO X,GU L,et al.AsynGraph:Maximizing data parallelism for efficient iterative graph processing on GPUs[J].ACM Transactions on Architecture and Code Optimization(TACO),2020,17(4):1-21.
[7]LOW Y,GONZALEZ J,KYROLA A,et al.Distributed gra-phlab:A framework for machine learning in the cloud[J].ar-Xiv:1204.6078,2012.
[8]XUE J,YANG Z,QU Z,et al.Seraph:an efficient,low-cost system for concurrent graph processing[C]//Proceedings of the 2014 International Symposium on High-performance Parallel and Distributed Computing.2014:227-238.
[9]ZHANG Y,LIAO X,JIN H,et al.{CGraph}:A correlations-aware approach for efficient concurrent iterative graph processing[C]//Proceedings of the 2018 USENIX Annual Technical Conference.2018:441-452.
[10]ZHAO J,ZHANG Y,LIAO X,et al.GraphM:an efficient sto-rage system for high throughput of concurrent graph processing[C]//Proceedings of the International Conference for High Performance Computing,Networking,Storage and Analysis.2019:1-14.
[11]CHEN H,SHEN M,XIAO N,et al.Krill:a compiler and runtime system for concurrent graph processing[C]//Proceedings of the International Conference for High Performance Computing,Networking,Storage and Analysis.2021:1-16.
[12]YIN X,ZHAO Z,GUPTA R.Glign:Taming misaligned graphtraversals in concurrent graph processing[C]//Proceedings of the 2022 ACM International Conference on Architectural Support for Programming Languages and Operating Systems.2022:78-92.
[13]PARK J S,PENNER M,PRASANNA V K.Optimizing graph algorithms for improved cache performance[J].IEEE Transactions on Parallel and Distributed Systems,2004,15(9):769-782.
[14]SAFRO I,TEMKIN B.Multiscale approach for the networkcompression-friendly ordering[J].Journal of Discrete Algorithms,2011,9(2):190-202.
[15]BANERJEE J,KIM W,KIM S J,et al.Clustering a DAG for CAD Databases[J].IEEE Transactions on Software Enginee-ring,1988,14(11):1684-1699.
[16]BALAJI V,CRAGO N,JALEEL A,et al.P-opt:Practical optimal cache replacement for graph analytics[C]//Proceedings of the 2021 IEEE International Symposium on High-Performance Computer Architecture.IEEE,2021:668-681.
[17]JALEEL A,THEOBALD K B,STEELY JR S C,et al.Highperformance cache replacement using re-reference interval prediction(RRIP)[J].ACM SIGARCH Computer Architecture News,2010,38(3):60-71.
[18]FALDU P,DIAMOND J,GROT B.Domain-specialized cachemanagement for graph analytics[C]//Proceedings of the 2020 IEEE International Symposium on High Performance Computer Architecture.IEEE,2020:234-248.
[19]HAM T J,WU L,SUNDARAM N,et al.Graphicionado:Ahigh-performance and energy-efficient accelerator for graph analytics[C]//Proceedings of the 2016 Annual IEEE/ACM International Symposium on Microarchitecture.IEEE,2016:1-13.
[20]NGUYEN D,LENHARTH A,PINGALI K.A lightweight infrastructure for graph analytics[C]//Proceedings of the 2013 ACM Symposium on Operating Systems Principles.2013:456-471.
[21]SANCHEZ D,KOZYRAKIS C.ZSim:Fast and accurate microarchitectural simulation of thousand-core systems[J].ACM SIGARCH Computer architecture news,2013,41(3):475-486.

Related Articles 15

[1]	LONG Tie, XIAO Fu, FAN Weibei, HE Xin, WANG Junchang. Cubic+:Enhanced Cubic Congestion Control for Cross-datacenter Networks [J]. Computer Science, 2025, 52(8): 335-342.
[2]	DI Jianqiang, YUAN Liang, ZHANG Yunquan, ZHANG Sijia. Performance Optimization of Complex Stencil in Weather Forecast Model WRF [J]. Computer Science, 2024, 51(4): 56-66.
[3]	LIAO Qihua, NIE Kai, HAN Lin, CHEN Mengyao, XIE Wenbing. Tile Selection Algorithm Based on Data Locality [J]. Computer Science, 2024, 51(12): 100-109.
[4]	ZHAI Xulun, ZHANG Yongguang, JIN Anzhao, QIANG Wei, LI Mengbing. Parallel DVB-RCS2 Turbo Decoding on Multi-core CPU [J]. Computer Science, 2023, 50(6): 22-28.
[5]	ZHAI Jia-qi, LI Bin, ZHOU Qing-lei, CHEN Xiao-jie. Implementation of FPGA-based High-performance and Scalable SM4-GCM Algorithm [J]. Computer Science, 2022, 49(10): 74-82.
[6]	ZHANG Fan, GONG Ao-yu, DENG Lei, LIU Fang, LIN Yan, ZHANG Yi-jin. Wireless Downlink Scheduling with Deadline Constraint for Realistic Channel Observation Environment [J]. Computer Science, 2021, 48(9): 264-270.
[7]	HU Wei-fang, CHEN Yun, LI Ying-ying, SHANG Jian-dong. Loop Fusion Strategy Based on Data Reuse Analysis in Polyhedral Compilation [J]. Computer Science, 2021, 48(12): 49-58.
[8]	SONG Ying, ZHONG Xian, SUN Bao-lin, GUI Chao. Sliding Window-based Network Coding Cooperative Algorithm in MANET [J]. Computer Science, 2020, 47(11): 322-326.
[9]	JI Bao-feng, WANG Yi-dan, XING Bing-bing, LI Yu-qi, GAO Hong-feng, HAN Cong-cheng. Enhancement Method of Throughput in Ultra-dense Network Based on Hierarchical Multi-hop Physical Layer Network Coding [J]. Computer Science, 2019, 46(7): 56-60.
[10]	ZHOU Wei-xing, SHI Hai-he. Survey on Sequence Assembly Algorithms in High-throughput Sequencing [J]. Computer Science, 2019, 46(5): 36-43.
[11]	YAO Xin-wei, ZHANG Meng-na, WANG Wan-liang, YANG Shuang-hua. Optimal Energy Allocation Algorithm with Energy Harvesting and Hybrid Energy Storage for Microscale Wireless Networks [J]. Computer Science, 2018, 45(8): 75-79.
[12]	ZHANG Yun-chun, LI Long-bao, YAO Shao-wen, HU Jian-tao, ZHANG Chen-bin. Sandpile Model Based Load Balancing Algorithm in Wireless Mesh Networks [J]. Computer Science, 2018, 45(8): 84-87.
[13]	CHI Kai-kai ,WEI Xin-chen, LIN Yi-min. High-throughput and Load-balanced Node Access Scheme for RF-energy Harvesting Wireless Sensor Networks [J]. Computer Science, 2018, 45(8): 119-124.
[14]	CHI Kai-kai, XU Xin-chen, WEI Xin-chen. Minimal Base Stations Deployment Scheme Satisfying Node Throughput Requirement in Radio Frequency Energy Harvesting Wireless Sensor Networks [J]. Computer Science, 2018, 45(6A): 332-336.
[15]	CHI Kai-kai, LIN Yi-min, LI Yan-jun, CHENG Zhen. Duty Cycle Scheme Maximizing Throughput in Energy Harvesting Sensor Networks [J]. Computer Science, 2018, 45(6): 100-104.

Metrics

Viewed

Full text

Abstract

Cited

Shared

Discussed

Comments

Recommended 0

No Suggested Reading articles found!

Locality-aware Cache Management Strategy for Concurrent Graph Analysis

PDF (PC)

Abstract

Cite this article

share this article

References

Related Articles 15

Metrics

Comments

Recommended 0