Computer Science ›› 2023, Vol. 50 ›› Issue (1): 34-40.doi: 10.11896/jsjkx.211100049

• Database & Big Data & Data Science • Previous Articles     Next Articles

Study on Big Graph Traversals for Storage Medium Optimization

JIAO Tianzhe, HE Hongyan, ZHANG Zexin, SONG Jie   

  1. School of Software College,Northeastern University,Shenyang 110819,China
  • Received:2021-11-04 Revised:2022-06-23 Online:2023-01-15 Published:2023-01-09
  • About author:JIAO Tianzhe,born in 1997,Ph.D.Her main research interests include big data management and machine learning.
    SONG Jie,born in 1980,Ph.D,professor.His main research interests include big data management,green computing and machine learning.
  • Supported by:
    National Natural Science Foundation of China(61672143).

Abstract: As the basis of many algorithms,the breadth-first search(BFS) algorithm for big graph data has attracted more and more interests industrially and academically.Among the researches on BFS for the big graph,the solid state disk(SSD) is utilized to improve algorithm performance.In the graph traversal,the storage devices are required to continuously and repeatedly load data to fulfill the demand of graph traversal.However,many operations of data wipe caused by continuously and repeatedly loading data severely degrade the lifetime of SSD.For this reason,the lifetime of SSD can be effectively extended by reducing the data for write operations.Firstly,combined with the characteristics of graph structure,the data reuse model is constructed for describing the levels of data reuse in graph traversal.Then,a heuristic priority access algorithm based on vertex degree is proposed.By judging the independence of vertexes,the proposed algorithm selects graph vertexes that are priority accessed for improving the probability of data reusing,increasing hit ratio,and reducing the wear of flash memory.This method is available to any BFS algorithms and datasets without modifying BFS algorithm and big graph data.In the end,simulation results demonstrate that the proposed model and algorithm are effective.By applying the proposed algorithm on three common BFS algorithms:BFS-4K,B40C,and Gunrock,it can effectively reduce the operations of data written in graph traversal and improve the lifetime of SSD 12%,15%,22%,respectively.

Key words: Graph traversal, Solid state disk, Cache hit rate, Heuristic, Lifetime

CLC Number: 

  • TP311
[1]PAN Y,LI Y,ZHANG H,et al.GFTL:Group-level mapping in flash translation layer to provide efficient address translation for NAND flash-based SSDs[J].IEEE Transaction on Consumer Electronics,2020,66(3):242-250.
[2]SHI H F,WANG C G,CAI R Y,et al.Node Static Processing and Caching Strategy in Space-Ground Integrated Intelligent Network[J].Computer Engineering,2021,47(1):30-36.
[3]SABET A H,ZHAO Z J,GUPTA R.Subway:Minimizing Data Transfer during out-of-GPU-Memory Graph Processing[C]//European Conference on Computer Systems.Association for Computing Machinery,2020:1-16.
[4]LIU H,HUANG H.Enterprise:Breadth-first graph traversal on GPUs[C]//High Performance Computing,Networking,Storage and Analysis.ACM,2015:1-12.
[5]HU J,JIANG H,TIAN L,et al.GC-ARM:garbage collection-aware RAM management for flash based solid state drives[C]//International Conference on Networking,Architecture and Sto-rage(NAS).IEEE,2012:134-143.
[6]PAN Y,LIN M,WU Z,et al.Caching-Aware Garbage Collection to Improve Performance and Lifetime for NAND Flash SSDs[J].IEEE Transactions on Consumer Electronics,2021,67(2):141-148.
[7]PAIK J,CHO E,JIN R,et al.Selective-Delay Garbage Collection Mechanism for Read Operations in Multichannel FlashBased Storage Devices[J].IEEE Transactions on Consumer Electronics,2018,64(1):118-126.
[8]YANG Y,MISRA V,RUBENSTEIN D.On the optimality ofgreedy garbage collection for SSDs[J].ACM SIGMETRICS Performance Evaluation Review,2015,43(2):63-65.
[9]HOUDT B V.Performance of garbage collection algorithms for flash-based solid state drives with hot/cold data[J].Perfor-mance Evaluation,2013,70(10):692-703.
[10]PELEATO B,TABRIZI H,AGARWAL R,et al.BER-basedwear leveling and bad block management for NAND flash[C]//International Conference on Communications(ICC).IEEE,2015:295-300.
[11]CHEN Y,CHI H,YANG C,et al.Enabling the Duo-phase Data Management to Realize Longevity Bit-alterable Flash Memory[J].IEEE Transactions on Computers,2021,(16):1982-1997.
[12]YANG P,XUE N,ZHANG Y,et al.Reducing garbage collection overhead in SSD based on workload prediction[C]//Hot Topics in Storage and File Systems(HotStorage19).ACM Workshop,2019:1-8.
[13]CHENG Z J,WANG L,CHENG Y D,et al.File Access Popularity Prediction for Hierarchical Storage for High-Energy Phy-sics[J].Computer Engineering,2021,47(2):126-132.
[14]KANG D H,MIN C,EOM Y I.An Efficient Buffer Replacement Algorithm for NAND Flash Storage Devices[C]//IEEE International Symposium on Modelling,Analysis,and Simulation of Computer and Telecommunication Systems.IEEE,2014:239-248.
[15]JUNG M,CHOI W,SRIKANTAIAH S,et al.HIOS:A HostInterface I/O Scheduler for Solid State Disks[J].ACM SIGARCH Computer Architecture News,2014,42(3):289-300.
[16]JUNG M,KANDEMIR M.Sprinkler:Maximizing resource utilization in many-chip solid state disks[C]//International Symposium on High Performance Computer Architecture(HPCA'14).IEEE,2014:524-535.
[17]KANG J U,HYUN J,MAENG H,et al.The Multistreamed Solid-State Drive[C]//Hot Topics in Storage and File Systems(HotStorage).ACM Workshop,2014:13-17.
[18]YAN H,HUANG Y,ZHOU X,et al.An efficient and non-timesensitive file-aware garbage collection algorithm for NAND flash-based consumer electronics[J].IEEE Transactions on Consumer Electronics,2019,65(1):73-79.
[19]SKOURTIS D,ACHLIOPTAS D,WATKINS N,et al.Flash on Rails:Consistent Flash Performance through Redundancy[C]//Annual Technical Conference(ATC).USENIX,2014:463-474.
[20]MA R,WU F,LU Z H,et al.BlockHammer:Improving Flash Reliability by Exploiting Process Variation Aware Proactive Failure Prediction[J].IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems,2020,39(12):4563-4574.
[21]SUN Q,ZHUGE Q F,CHEN X Z,et al.In-page Wear-leveling Memory Management Based on Non-volatile Memory[J].Computer Science,2018,45(S2):505-510.
[22]JOHN A,CHRISTOPHER R,NADY O,et al.Parboil:A revisedbenchmark suite for scientific and commercial throughput computing[EB/OL].(2012-03-19)[2021-10-20].http://impact.crhc.illino-is.edu/Shared/Docs/impact-12-01.parboil.pdf.
[23]SCOTT B,AYDIN B,KRSTE A,et al.Distributed memorybreadth-first search revisited:Enabling bottom-up search[C]//Parallel and Distributed Processing Symposium Workshops & PhD Forum(IPDPSW).IEEE,2013:1618-1627.
[24]LIU H,HUANG H.Enterprise:Breadth-first graph traversal on GPUs[C]//Proceedings of the International Conference for High Performance Computing,Networking,Storage and Analysis.ACM,2013:1-12.
[25]LI Q,LI H,ZHANG K,et al.A Survey of SSD Lifecycle Prediction[C]//International Conference on Software Engineering and Service Science(ICSESS).IEEE,2019:195-198.
[26]SALLINEN S,GHARAIBEH A,RIPEANU M.AcceleratingDirection-Optimized Breadth First Search on Hybrid Architectures[C]//European Conference on Parallel Processing.Sprin-ger International Publishing,2015:233-245.
[27]BUSATO F,BOMBIERI N.BFS-4K:an efficient implementation of BFS for Kepler GPU architectures[J].IEEE Transactions on Parallel and Distributed Systems,2015,26(7):1826-1838.
[28]MERRILL D,GARLAND M,GRIMSHAW A.Scalable GPU graph traversal[C]//ACM SIGPLAN Notices.ACM,2012:117-128.
[29]WANG Y,DAVIDSON A,PAN Y,et al.Gunrock:a high-performance graph processing library on the GPU[C]//Symposium on Principles and Practice of Parallel Programming.ACM SIGPLAN,2016:265-266.
[1] WANG Fang-hong, FAN Xing-gang, YANG Jing-jing, ZHOU Jie, WANG De-en. Strong Barrier Construction Algorithm Based on Adjustment of Directional Sensing Area [J]. Computer Science, 2022, 49(6A): 612-618.
[2] GENG Hai-jun, WANG Wei, YIN Xia. Single Node Failure Routing Protection Algorithm Based on Hybrid Software Defined Networks [J]. Computer Science, 2022, 49(2): 329-335.
[3] LIU Wen-wen, XIONG Wei, HAN Chi. Communication Satellite Task Relaxation Scheduling Method Based on Improved Hyper-heuristic Algorithm [J]. Computer Science, 2022, 49(11A): 210900125-6.
[4] CHENG Qiu-yun, LIU Jing-ju, YANG Guo-zheng, LUO Zhi-hao. Heuristic Method for Building Internet Multilayer Network Model [J]. Computer Science, 2022, 49(11A): 210800249-6.
[5] YU Sai-sai, WANG Xiao-juan, ZHANG Qian-qian. Detection of Malicious Behavior in Encrypted Traffic Based on Heuristic Search Feature Selection [J]. Computer Science, 2022, 49(11A): 210800237-6.
[6] LIU Zhong-hui, ZHAO Qi, ZOU Lu, MIN Fan. Heuristic Construction of Triadic Concept and Its Application in Social Recommendation [J]. Computer Science, 2021, 48(6): 234-240.
[7] GUO Biao, TANG Qi, WEN Zhi-min, FU Juan, WANG Ling, WEI Ji-bo. List-based Software and Hardware Partitioning Algorithm for Dynamic Partial Reconfigurable System-on-Chip [J]. Computer Science, 2021, 48(6): 19-25.
[8] ZHANG Xiao, ZHANG Si-meng, SHI Jia, DONG Cong, LI Zhan-huai. Review on Performance Optimization of Ceph Distributed Storage System [J]. Computer Science, 2021, 48(2): 1-12.
[9] GUO Qi-cheng, DU Xiao-yu, ZHANG Yan-yu, ZHOU Yi. Three-dimensional Path Planning of UAV Based on Improved Whale Optimization Algorithm [J]. Computer Science, 2021, 48(12): 304-311.
[10] TANG Wen-jun, LIU Yue, CHEN Rong. User Allocation Approach in Dynamic Mobile Edge Computing [J]. Computer Science, 2021, 48(1): 58-64.
[11] GUO Fei-yan, TANG Bing. Mobile Edge Server Placement Method Based on User Latency-aware [J]. Computer Science, 2021, 48(1): 103-110.
[12] FENG Bing-chao and WU Jing-li. Partheno-genetic Algorithm for Solving Static Rebalance Problem of Bicycle Sharing System [J]. Computer Science, 2020, 47(6A): 114-118.
[13] PAN Heng, LI Jing feng, MA Jun hu. Role Dynamic Adjustment Algorithm for Resisting Insider Threat [J]. Computer Science, 2020, 47(5): 313-318.
[14] ZHANG Xu, WANG Li-li, YANG Bo-tao. Heuristic Algorithms for Two-dimensional Irregular Bin Packing Problem with GuillotineConstraints [J]. Computer Science, 2020, 47(5): 212-216.
[15] YANG Ting, LUO Fei, DING Wei-chao, LU Hai-feng. Bin Packing Algorithm Based on Adaptive Optimization of Slack [J]. Computer Science, 2020, 47(4): 211-216.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!