Computer Science ›› 2022, Vol. 49 ›› Issue (6A): 125-132. doi: 10.11896/jsjkx.210600135

• Intelligent Computing •

Hybrid Improved Flower Pollination Algorithm and Gray Wolf Algorithm for Feature Selection

KANG Yan, WANG Hai-ning, TAO Liu, YANG Hai-xiao, YANG Xue-kun, WANG Fei, LI Hao

  1. School of Software, Yunnan University, Kunming 650500, China
  • Online: 2022-06-10 Published: 2022-06-08
  • Corresponding author: LI Hao (493895015@qq.com)
  • About author: KANG Yan (kangyan@ynu.edu.cn), born in 1972, postgraduate supervisor, is a member of the China Computer Federation. Her main research interests include machine learning and software engineering.
    LI Hao, born in 1970, postgraduate supervisor, is a member of the China Computer Federation. His main research interests include machine learning and software engineering.
  • Supported by:
    Yunnan Provincial Science and Technology Department Major Special Projects (2019ZE001-1, 202002AB080001-6).

Abstract: Feature selection is a critical step in data preprocessing. Its quality affects both the training time and the performance of a neural network. The Grey Wolf Improved Flower Pollination Algorithm (GIFPA) is a hybrid algorithm that fuses the gray wolf optimization (GWO) algorithm into the framework of the flower pollination algorithm (FPA). Applied to feature selection, it preserves the meaning of the original features while maximizing classification accuracy. GIFPA adds worst-individual information to the cross-pollination stage of FPA and uses that stage for global search, uses the hunting process of GWO for local search, and balances the two searches with a conversion coefficient. In addition, to address the tendency of swarm intelligence algorithms to become trapped in local optima, this paper adopts, for the first time, the ReliefF algorithm from the data mining field: ReliefF selects high-weight features, which are then used to improve the best-individual information. To verify the performance of the algorithm, 21 classical datasets from the UCI database are used for testing, a k-nearest neighbor (KNN) classifier is used for classification and evaluation, fitness value and accuracy serve as the evaluation criteria, and K-fold cross-validation is used to mitigate overfitting. The experiments compare GIFPA with a variety of classical and state-of-the-art algorithms, including FPA, and the results show that GIFPA is highly competitive on feature selection problems.

Key words: Feature selection, FPA, GWO, Optimizer, ReliefF
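
The evaluation setup named in the abstract (a KNN classifier, K-fold cross-validation, and a fitness value alongside accuracy) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The weighted fitness form alpha * error + (1 - alpha) * (selected features / all features), the values alpha = 0.99, k = 3 and 5 folds, the toy data, and every function name are assumptions that do not come from the abstract.

```python
import random


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k nearest training rows (squared Euclidean distance)."""
    order = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    votes = {}
    for i in order[:k]:
        votes[train_y[i]] = votes.get(train_y[i], 0) + 1
    return max(votes, key=votes.get)


def cv_accuracy(X, y, mask, k=3, folds=5):
    """Pooled K-fold accuracy of KNN using only the columns where mask is 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # assumption: an empty feature subset scores zero accuracy
    Xs = [[row[j] for j in cols] for row in X]
    n = len(Xs)
    correct = 0
    for f in range(folds):
        # Row i belongs to the test fold f when i % folds == f.
        test_idx = [i for i in range(n) if i % folds == f]
        train_idx = [i for i in range(n) if i % folds != f]
        tX = [Xs[i] for i in train_idx]
        ty = [y[i] for i in train_idx]
        correct += sum(knn_predict(tX, ty, Xs[i], k) == y[i] for i in test_idx)
    return correct / n


def fitness(X, y, mask, alpha=0.99):
    """Lower is better: weighted sum of error rate and fraction of features kept."""
    error = 1.0 - cv_accuracy(X, y, mask)
    ratio = sum(mask) / len(mask)
    return alpha * error + (1.0 - alpha) * ratio


# Toy data (hypothetical): column 0 separates the two classes, columns 1-3 are noise.
rng = random.Random(0)
y = [i % 2 for i in range(40)]
X = [[label + rng.uniform(-0.2, 0.2)] + [rng.random() for _ in range(3)] for label in y]

print("informative only:", fitness(X, y, [1, 0, 0, 0]))
print("all features:    ", fitness(X, y, [1, 1, 1, 1]))
```

A search algorithm such as GIFPA would call a function like `fitness` on each candidate binary mask and keep the masks with the lowest values. On this toy data the single-feature mask is guaranteed a lower (better) fitness than the full mask, because its error is zero and it keeps fewer features.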

CLC Number: TP391