Computer Science, 2017, Vol. 44, Issue (6): 250-254. doi: 10.11896/j.issn.1002-137X.2017.06.043


Improved Apriori Algorithm and Its Application Based on MapReduce

ZHAO Yue, REN Yong-gong and LIU Yang   

  • Online: 2018-11-13 Published: 2018-11-13

Abstract: With the rapid development of mobile communications and Internet technology, how to analyze the requirements of mobile users efficiently and deliver useful information in time has become a hot issue in data mining. To recommend analysis results to users efficiently and promptly, a mining method based on MapReduce, named the MRS-Apriori algorithm, is proposed. Building on the classical Apriori algorithm, the method defines a coding rule to optimize the database. A flag named Judgemark is added to the database to judge whether items in the transaction database are frequent. This mechanism makes the connection (join) step of MRS-Apriori more efficient by reducing the cost of scanning the database. On the basis of the coding rule, MRS-Apriori uses the MapReduce programming model under Hadoop to achieve parallel processing, which improves the performance of the iterative connection step and reduces the time needed to process large-scale data. The experimental results show that MRS-Apriori effectively reduces running time and achieves high accuracy on large data sets.

Key words: Coding rules, Association rules, Frequent itemsets, MapReduce framework
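
The abstract does not give the details of the coding rule or of Judgemark, so the following is only a minimal sketch of the classical level-wise Apriori algorithm with its support counting split into a map step and a reduce step. It is not the MRS-Apriori method itself. All function names (`map_count`, `reduce_counts`, `generate_candidates`, `apriori`) and the sample data are hypothetical, and the partitions are processed sequentially in one process rather than on a Hadoop cluster.

```python
from collections import Counter
from itertools import combinations


def map_count(partition, candidates):
    """Map step: count each candidate itemset within one data partition."""
    counts = Counter()
    for transaction in partition:
        items = set(transaction)
        for candidate in candidates:
            if candidate <= items:  # candidate is a subset of the transaction
                counts[candidate] += 1
    return counts


def reduce_counts(partial_counts, min_support):
    """Reduce step: sum per-partition counts and keep the frequent itemsets."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return {itemset: n for itemset, n in total.items() if n >= min_support}


def generate_candidates(frequent, k):
    """Join frequent (k-1)-itemsets into k-itemsets, pruning by the Apriori property."""
    candidates = set()
    for a in frequent:
        for b in frequent:
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in frequent for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
    return candidates


def apriori(partitions, min_support):
    """Level-wise Apriori over partitioned transactions (hypothetical driver)."""
    items = {item for part in partitions for t in part for item in t}
    candidates = {frozenset([item]) for item in items}
    result = {}
    k = 1
    while candidates:
        # On a cluster, each partition would be handled by a separate mapper.
        partial = [map_count(part, candidates) for part in partitions]
        frequent = reduce_counts(partial, min_support)
        result.update(frequent)
        k += 1
        candidates = generate_candidates(set(frequent), k)
    return result


# Hypothetical sample data: five transactions split into two partitions.
partitions = [
    [["a", "b", "c"], ["a", "b"]],
    [["a", "c"], ["b", "c"], ["a", "b", "c"]],
]
frequent_itemsets = apriori(partitions, min_support=3)
```

With this sample data and an absolute support threshold of 3, the three single items and the three pairs are frequent, while {a, b, c} appears in only two transactions and is pruned. Each pass over the data corresponds to one iteration of the connection step, which is the cost the abstract says the proposed method reduces.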

