Computer Science ›› 2015, Vol. 42 ›› Issue (10): 208-210.

Previous Articles     Next Articles

Improved Apriori Algorithm Based on Bigtable and MapReduce

WEI Ling, WEI Yong-jiang and GAO Chang-yuan   

  • Online:2018-11-14 Published:2018-11-14

Abstract: BM-Apriori algorithm was designed for big data to address the poor efficiency problem of Apriori in mining frequent item sets.BM-Apriori takes advantages of Bigtable and MapReduce together to optimize Apriori algorithm.Compared with the improved Apriori algorithm simply based on MapReduce model,timestamp of Bigtable is utilized in this algorithm to avoid generating a large number of key/value pairs.It saves the pattern matching time and scans the database only once.Also,to obtain transaction marks automatically,transaction mark column is added to set list for computing support numbers.BM-Apriori was executed on Hadoop platform.The experimental results show that BM-Apriori has higher efficiency and scalability.

Key words: Apriori algorithm,Bigtable,MapReduce,Big data

[1] Hajian S,Domingo-Ferrer J.A methodology for direct and indi-rect discrimination prevention in data mining [J].IEEE Transactions on Knowledge and Data Engineering,2013,25(7):1445-1459
[2] Lara J,Lizcano D,Martinez A,et al.A UML profile for the conceptual modeling of structurally complex data:Easing human effort in the KDD process [J].Information and Software Techno-logy,2014,56(3):335-351
[3] Agrawal R,Imielinski T,Swami A.Database mining:a performance perspective [J].IEEE Transactions on Knowledge and Data Engineering,1993,5(6):914-925
[4] 张震,汪斌强,陈庶樵,等.基于多维计数型布鲁姆过滤器的大流检测机制[J].电子与信息学报,2010,32(7):1608-1613 Zhang Zhen,Wang Bin-qiang,Chen Shu-qiao,et al.A Mechanism of Identifying Heavy Hitters Based on Multi-dimensional Counting Bloom Filter[J].Journal of Electronics & Information Technology,2010,32(7):1608-1613
[5] Wang B L,Shen Y G.Improvement of Apriori algorithm based on boolean matrix [J].Advanced Materials Research,2011,159:144-148
[6] 罗丹,李陶深.一种基于压缩矩阵的Apriori算法改进研究[J].计算机科学,2013,40(12):75-78 Luo Dan,Li Tao-shen.Research on improved Apriori algorithm based on matrix compression [J].Computer Science,2013,40(12):75-78
[7] 李晓虹,尚晋.一种改进的新Apriori算法[J].计算机科学,2007,32(4):196-197 Li Xiao-hong,Shang Jin.An improved Apriori algorithm[J].Computer Science,2007,32(4):196-197
[8] Grudzinski P,Wojciechowski M.Integration of candidate hashtrees in concurrent processing of frequent itemset queries using Apriori[J].Control and Cybernetics,2009,38(1):47-65
[9] Jongwook W.Market Basket Analysis algorithms with MapReduce[J].Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery,2013,3(6):445-452
[10] Karim R,Hossain A,Rashid M,et al.A MapReduce Framework for Mining Maximal Contiguous Frequent Patterns in Large DNA Sequence Datasets [J].IETE Techical Review,2012,29(2):162-168
[11] Chang F,Dean J,Ghemawat S,et al.Bigtable:A distributedstorage system for structured data [J].ACM Transactions on Computer Systems,2008,46(2):205-218
[12] Kim W.Web data stores (aka NoSQL databases):a data model and data management perspective [J].International Journal of Web and Grid Services,2014,10(1):100-110

No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!