Computer Science ›› 2023, Vol. 50 ›› Issue (11A): 220700200-8. doi: 10.11896/jsjkx.220700200
赵学健1,2, 赵可1
ZHAO Xuejian1,2, ZHAO Ke1
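Abstract: Exact frequent itemset mining algorithms have poor time efficiency and struggle with large-scale datasets. To address this problem, a genetic-algorithm-based frequent itemset mining strategy, GAA-FIM (Genetic Algorithm combining Apriori property based Frequent Itemset Mining), is proposed, and detailed rules are given for its encoding, crossover, mutation, and selection operations. The algorithm combines the genetic algorithm with the downward closure property used by exact frequent itemset mining algorithms. It improves on the traditional sexual-reproduction style of crossover, preferentially adds individuals with good genes to the candidate population of the new generation, and enlarges that candidate population through mutation, in order to improve time efficiency and obtain higher-quality frequent itemsets. The performance of GAA-FIM is verified on synthetic and real datasets. The experimental results show that, compared with algorithms such as GAFIM and GA-Apriori, GAA-FIM achieves better time efficiency and further improves the quality of the frequent itemsets found.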
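The abstract does not spell out the operator rules, so the following is only a minimal sketch of the general idea: a genetic search over itemsets that uses the downward closure property (every superset of an infrequent itemset is itself infrequent) to discard candidates without counting them. It is not the GAA-FIM procedure itself. The function names `support` and `ga_frequent_itemsets`, the set-union crossover, the one-item mutation, and the parameters `generations` and `seed` are all assumptions made for illustration.

```python
import random


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def ga_frequent_itemsets(transactions, min_sup, generations=10, seed=0):
    """Approximate frequent itemset search (illustrative, not GAA-FIM)."""
    rng = random.Random(seed)
    items = sorted(set().union(*transactions))

    # Encoding (assumed): an individual is a frozenset of items.
    # The initial population is the set of frequent single items.
    frequent = {frozenset([i]) for i in items
                if support(frozenset([i]), transactions) >= min_sup}
    if not frequent:
        return frequent
    frequent_items = sorted(i for s in frequent for i in s)
    infrequent = set()  # itemsets already known to be infrequent

    for _ in range(generations):
        # Parents are drawn only from individuals known to be frequent.
        parents = sorted(frequent, key=lambda s: (len(s), sorted(s)))
        candidates = set()
        for a in parents:
            # Crossover (assumed form): union of two frequent parents.
            candidates.add(a | rng.choice(parents))
            # Mutation (assumed form): extend a parent by one frequent item.
            candidates.add(a | {rng.choice(frequent_items)})

        # Selection: keep candidates that meet the support threshold.
        for c in candidates - frequent - infrequent:
            # Downward closure: a superset of an infrequent itemset
            # cannot be frequent, so skip it without scanning the data.
            if any(bad <= c for bad in infrequent):
                continue
            if support(c, transactions) >= min_sup:
                frequent.add(c)
            else:
                infrequent.add(c)
    return frequent
```

Called as `ga_frequent_itemsets(transactions, min_sup=0.6)` on a list of Python sets, the sketch returns only itemsets whose support meets the threshold, but, like any heuristic search, it may miss some frequent itemsets that an exact algorithm would find.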