Computer Science ›› 2023, Vol. 50 ›› Issue (6): 29-35.doi: 10.11896/jsjkx.220800050

• High Performance Computing • Previous Articles     Next Articles

Lock-free Parallel Semi-naive Algorithm Based on Multi-core CPU

YU Ting, WANG Lisong, QIN Xiaolin   

  1. College of Computer Science and Technology,Nanjing University of Aeronautics and Astronautics,Nanjing 211106,China
  • Received:2022-08-05 Revised:2022-11-10 Online:2023-06-15 Published:2023-06-06
  • About author:YU Ting,born in 1998,postgraduate.Her main research interests include data management and parallel algorithm.QIN Xiaolin,born in 1953,Ph.D,is a senior member of China Computer Fe-deration.His main research interests include data management and security in distributed environment and so on.
  • Supported by:
    National Natural Science Foundation of China(61802182).

Abstract: Datalog system has a wide range of applications in many fields,such as graph databases,networks,and static program analysis.When dealing with massive data,the serial-based Datalog evaluation strategy cannot fully utilize the computational resources of existing multicore processors.To address these problems,a lock-free parallelized semi-naive algorithm,parallel semi-naive(PSN) based on multi-core CPU is proposed to support efficient Datalog evaluation.PSN uses the B+ tree index to partition data and allocates data to different threads to perform calculations.The intermediate result tuples generated from each partition are different from each other,which is conducive to the realization of lock-free parallelism during calculation.PSN uses a double-layer hash table structure to index intermediate results to improve the efficiency of duplicate checking.Each thread only performs related calculations in a specific area,without using locks to prevent write conflicts.PSN adopts task queue and thread pool to allocate tasks to idle threads to achieve multi-thread load balancing.Experimental results on the public datasets of Koblenz network collection(KONECT) and Stanford network analysis platform( SNAP) show that the PSN algorithm can optimize the query performance of datalog rules.

Key words: Datalog, Semi-naive algorithm, Parallel, Recursive rule, Load balance

CLC Number: 

  • TP311
[1]CERI S,GOTTLOB G,TANCA L.What you always wanted to know about Datalog(and never dared to ask)[J].IEEE Transactions on Knowledge and Data Engineering,1989,1(1):146-166.
[2]ULLMAN J D.Principles of Database and Knowledge-Base Systems[M].Rockville,MD:Computer Science Press,1988.
[3]SHKAPSKY A,ZENG K,ZANIOLO C.Graph queries in anext-generationdatalog system[J].Proceedings of the VLDB Endowment,2013,6(12):1258-1261.
[4]FOGEL A,FUNG S,PEDROSA L,et al.A general approach to network configuration analysis[C]//12th USENIX Symposium on Networked Systems Design and Implementation(NSDI 15).2015:469-483.
[5]ALLEN N,KRISHNAN P,SCHOLZ B.Combining type-analysis with points-to analysis for analyzing Java library source-code[C]//Proceedings of the 4th ACM SIGPLAN International Workshop on State of the Art in Program Analysis.2015:13-18.
[6]HASSANSHAHI B,RAMESH R K,KRISHNAN P,et al.An efficient tunable selective points-to analysis for large codebases[C]//Proceedings of the 6th ACM SIGPLAN International Workshop on State of the Art in Program Analysis.2017:13-18.
[7]SCHOLZ B,VOROBYOV K,KRISHNAN P,et al.Adatalogsource-to-source translator for static program analysis:An experience report[C]//2015 24th Australasian Software Engineering Conference.IEEE,2015:28-37.
[8]LIVSHITS V B,LAM M S.Finding Security Vulnerabilities in Java Applications with Static Analysis[C]//USENIX Security Symposium.2005:18-18.
[9]GREEN T J,HUANG SS,LOO B T,et al.Datalog and recursive query processing[J].Foundations and Trends © in Databases,2013,5(2):105-195.
[10]SCHOLZ B,JORDAN H,SUBOTIC' P,et al.On fast large-scaleprogram analysis indatalog[C]//Proceedings of the 25th International Conference on Compiler Construction.2016:196-206.
[11]MEI H,QIN G,XU M,et al.NeuralDatalog through time:Informed temporal modeling via logical specification[C]//International Conference on Machine Learning.PMLR,2020:6808-6819.
[12]JORDAN H,SUBOTIC' P,ZHAO D,et al.A specialized B-tree for concurrentdatalog evaluation[C]//Proceedings of the 24th Symposium on Principles and Practice of Parallel Programming.2019:327-339.
[13]MOTIK B,NENOV Y,PIRO R,et al.Maintenance ofdatalog materialisations revisited[J].Artificial Intelligence,2019,269:76-136.
[14]MARTÍNEZ-ANGELES C A,DUTRA I,COSTA V S,et al.Adatalog engine for gpus[M]//Declarative Programming and Knowledge Management.Cham:Springer,2013:152-168.
[15]ABITEBOUL S,HULL R,VIANU V.Foundations of Databa-ses[J].SIGMOD Record,2000,29(3):76-87.
[16]BRAVENBOER M,SMARAGDAKIS Y.Exception analysis and points-to analysis:Better together[C]//Proceedings of the Eighteenth International Symposium on Software Testing and Analysis.2009:1-12.
[17]HODER K,BJØRNER N,MOURA L.μZ-an efficient engine for fixed points with constraints[C]//International Conference on Computer Aided Verification.Berlin:Springer,2011:457-462.
[18]MOTIK B,NENOV Y,PIRO R,et al.Parallelmaterialisation of datalog programs in centralised,main-memory RDF systems[C]//Proceedings of the AAAI Conference on Artificial Intelligence.2014.
[19]WHALEY J,AVOTS D,CARBIN M,et al.UsingDatalog with binary decision diagrams for program analysis[C]//Asian Symposium on Programming Languages and Systems.Berlin:Sprin-ger,2005:97-118.
[20]DONG G.On distributedprocessibility of datalog queries by decomposing databases[C]//Proceedings of the 1989 ACM SIGMOD International Conference on Management of Data.1989:26-35.
[21]SHAO J,BELL D A,HULL M E C.Combining rule decomposition and data partitioning in paralleldatalog program processing[C]//Proceedings of the First International Conference on Pa-rallel and Distributed Information Systems.IEEE Computer So-ciety,1991:106-107.
[22]GILRAY T,KUMAR S,MICINSKI K.Compiling data-paralleldatalog[C]//Proceedings of the 30th ACM SIGPLAN International Conference on Compiler Construction.2021:23-35.
[23]GANGULY S,SILBERSCHATZ A,TSUR S.Parallel bottom-up processing ofdatalog queries[J].The Journal of Logic Programming,1992,14(1/2):101-126.
[24]SEO J,PARK J,SHIN J,et al.Distributed socialite:Adatalog-based language for large-scale graph analysis[J].Proceedings of the VLDB Endowment,2013,6(14):1906-1917.
[25]WOLFSON O,SILBERSCHATZ A.Distributed processing of logic programs[C]//Proceedings of the 1988 ACM SIGMOD International Conference on Management of Data.1988:329-336.
[26]YANG M,SHKAPSKY A,ZANIOLO C.Scaling up the per-formance of more powerfuldatalog systems on multicore machines[J].The VLDB Journal,2017,26(2):229-248.
[27]FAN Z,ZHU J,ZHANG Z,et al.Scaling-Up In-Memory Datalog Processing:Observations and Techniques[C]//Proceedings of the VLDB Endowment.2019.
[28]MCSHERRY F,MURRAY D G,ISAACS R,et al.Differential Dataflow[C]//Proceedings of the Sixth Biennial Conference on Innovative Data Systems Research.2013.
[29]RYZHYK L,BUDIU M.Differential Datalog[C]//Proceedings of the Datalog 2.0-3rd International Workshop on the Resur-gence of Datalog in Academia and Industry.CEUR-WS,2019:56-67.
[30]LEONE N,PFEIFER G,FABER W,et al.The DLV system for knowledge representation and reasoning[J].ACM Transactions on Computational Logic(TOCL),2006,7(3):499-562.
[31]KUNEGIS J.The KONECT Project[EB/OL].http://konect.cc.
[32]LESKOVEC J,KREVL A.Stanford Large Network Dataset Collection[EB/OL].https://snap.stanford.edu/data.
[1] ZHAI Xulun, ZHANG Yongguang, JIN Anzhao, QIANG Wei, LI Mengbing. Parallel DVB-RCS2 Turbo Decoding on Multi-core CPU [J]. Computer Science, 2023, 50(6): 22-28.
[2] ZHAN Jin, WANG Xuefei, CHENG Yurong, YUAN Ye. Overview of Research on Bayesian Inference and Parallel Tempering [J]. Computer Science, 2023, 50(2): 89-105.
[3] WANG Zi-kai, ZHU Jian, ZHANG Bo-jun, HU Kai. Research and Implementation of Parallel Method in Blockchain and Smart Contract [J]. Computer Science, 2022, 49(9): 312-317.
[4] WEI Kai-xuan, FU Ying. Re-parameterized Multi-scale Fusion Network for Efficient Extreme Low-light Raw Denoising [J]. Computer Science, 2022, 49(8): 120-126.
[5] ZONG Di-di, XIE Yi-wu. Model Medial Axis Generation Method Based on Normal Iteration [J]. Computer Science, 2022, 49(6A): 764-770.
[6] CHEN Xin, LI Fang, DING Hai-xin, SUN Wei-ze, LIU Xin, CHEN De-xun, YE Yue-jin, HE Xiang. Parallel Optimization Method of Unstructured-grid Computing in CFD for DomesticHeterogeneous Many-core Architecture [J]. Computer Science, 2022, 49(6): 99-107.
[7] LIU Jiang, LIU Wen-bo, ZHANG Ju. Hybrid MPI+OpenMP Parallel Method on Polyhedral Grid Generation in OpenFoam [J]. Computer Science, 2022, 49(3): 3-10.
[8] ZHAO Geng, SONG Xin-yu, MA Ying-jie. Secure Data Link of Unmanned Aerial Vehicle Based on Chaotic Sub-carrier Modulation [J]. Computer Science, 2022, 49(3): 322-328.
[9] ZHU Ruo-chen, YANG Chang-chun, ZHANG Deng-hui. EGOS-DST:Efficient Schema-guided Approach to One-step Dialogue State Tracking for Diverse Expressions [J]. Computer Science, 2022, 49(11A): 210900246-7.
[10] XIE Hao-shan, LIU Xiao-nan, ZHAO Chen-yan, HE Ming, SONG Hui-chao. Classical Simulation Realization of HHL Algorithm Based on OpenMP Parallel Model [J]. Computer Science, 2022, 49(11A): 211200028-5.
[11] HUANG Jia-wei, LI Xiao-peng, LING Cheng. Scalable Parallel Computing Method for Conditional Likelihood Probability of Nucleotide Molecular Phylogenetic Tree Based on GPU [J]. Computer Science, 2022, 49(11A): 210800189-7.
[12] LIU Yan, XIONG De-yi. Construction Method of Parallel Corpus for Minority Language Machine Translation [J]. Computer Science, 2022, 49(1): 41-46.
[13] LI Jia-zhen, JI Qing-ge. Dynamic Low-sampling Ambient Occlusion Real-time Ray Tracing for Molecular Rendering [J]. Computer Science, 2022, 49(1): 175-180.
[14] FU Tian-hao, TIAN Hong-yun, JIN Yu-yang, YANG Zhang, ZHAI Ji-dong, WU Lin-ping, XU Xiao-wen. Performance Skeleton Analysis Method Towards Component-based Parallel Applications [J]. Computer Science, 2021, 48(6): 1-9.
[15] HE Ya-ru, PANG Jian-min, XU Jin-long, ZHU Yu, TAO Xiao-han. Implementation and Optimization of Floyd Parallel Algorithm Based on Sunway Platform [J]. Computer Science, 2021, 48(6): 34-40.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!