计算机科学 ›› 2014, Vol. 41 ›› Issue (10): 238-243.doi: 10.11896/j.issn.1002-137X.2014.10.050

• 人工智能 • 上一篇    下一篇

基于GPU的并行化Apriori算法的设计与实现

唐家维,王晓峰   

  1. 上海海事大学信息工程学院 上海200431;上海海事大学信息工程学院 上海200431
  • 出版日期:2018-11-14 发布日期:2018-11-14
  • 基金资助:
    本文受国家海洋公益性行业专项(201305026)资助

Design and Implementation of Apriori on GPU

TANG Jia-wei and WANG Xiao-feng   

  • Online:2018-11-14 Published:2018-11-14

摘要: 大数据和高度并行的计算架构的时代已经来临,如何让传统的串行数据挖掘方法在当下获得更高的效率是一个值得探讨的问题。根据现代GPU大规模并行运算架构的特点(单结构多数据),对传统的串行Apriori算法进行并行化处理。使用最新的CUDA技术完成对传统串行Apriori算法中的支持度统计、候选集生成这两个计算的并行化实现,讨论了多种实现方法的差异,并提出改进方案。实验表明:改进后的并行算法使支持度统计在10000条事务的条件下效率提高16%,候选集生成在10000条事务的条件下效率提高25%。

关键词: 数据挖掘,关联规则,频繁模式,并行算法

Abstract: Big data and parallel computation era have come,and it is a trend to convert serial data mine algorithm into parallel algorithm to take advantage of cheap machine.In this paper two main steps,namely support counting and candidate set generation in serial apriori algorithm,were rebuilt parallelly on CUDA architecture.Meanwhile the difference between various implements of parallel apriori was compared to find a better solution.Finally,the experiments indicate that the time of support counting and candidate set generation decreases 16% and 25% respectively on a data set containing 10000 items.

Key words: Data minint,Association rules,Frequent itemset mining,Parallel agorithm

[1] Agrawal R,Srikant R.Fast algorithms for mining association rules[C]∥Proceedings of the 20th International Conference on Very Large Data Bases (VLDB’94).1994:487-499
[2] Agrawal R,Shafer J C.Parallel mining of association rules[J].IEEE Transactions on Knowledge and Data Engineering,1996,8(6):962-969
[3] Shah K D,Mahajan S.Maximizing the Efficiency of ParallelApriori Algorithm[C]∥ International Conference on Advances in Recent Technologies in Communication and Computing.IEEE,2009:107-109
[4] Li Ning,Zeng Li,He Qing,et al.Parallel Implementation ofApriori Algorithm Based on MapReduce[C]∥Software Engineering,Artificial Intelligence,Networking and Parallel & Distributed Computing (SNPD).2012:236-241
[5] Shintani T,Kitsuregawa M.Hash based parallel algorithms for mining association rules[C]∥Fourth International Comperence on Parallel and Distributed Information Systems.IEEE,1996:19-30
[6] Cui Qing-min,Guo Xiao-bo.Research on Parallel AssociationRules Mining on GPU[C]∥Proceedings of the 2nd International Conference on Green Communications and Networks.2013:215-222
[7] Yang Yuan-sen,Yang Chung-ming,Hsieh T J.GPU parallelization of an object-oriented nonlinear dynamic structural analysis platform[J].Simulation Modelling Practice and Theory,2014,40:112-121
[8] Shan Feng,Hart John C.Parallel computing on geostatistical data using CUDA[C]∥IDEALS.2014
[9] Smirnov V.Parallel Integration Using OpenMP and GPU toSolve Engineering Problems[J].Applied Mechanics and Materials,2014,475:1190-1194

No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!