Computer Science ›› 2023, Vol. 50 ›› Issue (5): 277-291. doi: 10.11896/jsjkx.220300269

• Artificial Intelligence •


Binary Harris Hawk Optimization and Its Feature Selection Algorithm

SUN Lin, LI Mengmeng, XU Jiucheng

  1. College of Computer and Information Engineering, Henan Normal University, Xinxiang, Henan 453007, China
    Engineering Lab of Intelligence Business and Internet of Things of Henan Province, Xinxiang, Henan 453007, China
  • Received: 2022-03-29 Revised: 2022-09-26 Online: 2023-05-15 Published: 2023-05-06
  • Corresponding author: SUN Lin (sunlin@htu.edu.cn)
  • About author: SUN Lin, born in 1979, Ph.D., associate professor, master's supervisor. His main research interests include granular computing, big data mining, machine learning and bioinformatics.
  • Supported by:
    National Natural Science Foundation of China (62076089, 61976082, 62002103, 61901160), Key Science and Technology Program of Henan Province, China (212102210136, 222102210169) and Key Scientific Research Project of Henan Provincial Higher Education of China (22B520013).

Abstract: The Harris Hawk optimization (HHO) algorithm initializes its population with a purely random strategy in the exploration phase, which reduces population diversity, and the linearly varying escape energy that controls the switch between exploration and exploitation makes the algorithm prone to falling into local optima in later iterations. To address these issues, this paper proposes a binary HHO and a metaheuristic feature selection algorithm based on it. First, in the exploration phase, the Sine mapping function is introduced to initialize the positions of the Harris hawk population, and an adaptive adjustment operator is used to change the search range of HHO and update the population positions. Second, the update formula of the escape energy is improved with a logarithmic inertia weight, the number of iterations is introduced into the jump distance, and a step-size adjustment parameter is used to adjust the search distance of HHO, thereby balancing exploration and exploitation; on this basis, an improved HHO algorithm is designed to avoid falling into local optima. Third, S-shaped and V-shaped transfer functions are introduced to update the binary positions and population positions of the improved HHO algorithm, yielding two binary improved HHO algorithms. Finally, a fitness function is used to evaluate feature subsets, and combining the binary improved HHO algorithms with this fitness function yields two binary improved HHO metaheuristic feature selection algorithms. Experimental results on 10 benchmark functions and 17 public datasets show that the four optimization strategies effectively improve the optimization performance of HHO on the 10 benchmark functions, and that the improved HHO algorithm is significantly better than the other compared optimization algorithms. On 12 UCI datasets and 5 high-dimensional gene datasets, comparisons with BHHO-based feature selection algorithms and other feature selection algorithms show that the V-shaped improved HHO feature selection algorithm has good optimization ability and classification performance.

Key words: Feature selection, Metaheuristic, Binary, Harris Hawk optimization, Fitness function
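
The abstract names three building blocks without giving their formulas: Sine-map initialization, S-shaped and V-shaped transfer functions, and a fitness function for feature subsets. The following is a minimal Python sketch of how such pieces are commonly written in the binary metaheuristic literature; it is not the authors' implementation. All function names are hypothetical, and the specific forms used here (the Sine map x_{k+1} = (a/4)·sin(πx_k), the sigmoid for the S-shaped function, |tanh(x)| for the V-shaped function, and a weighted sum of error rate and feature ratio with alpha = 0.99) are assumptions that may differ from the paper.

```python
import math
import random


def sine_map_population(n_hawks, dim, lower, upper, a=4.0, seed=0):
    """Chaotic initialization (assumed form): x_{k+1} = (a / 4) * sin(pi * x_k)."""
    rng = random.Random(seed)
    x = rng.uniform(0.1, 0.9)  # starting value of the chaotic sequence in (0, 1)
    population = []
    for _ in range(n_hawks):
        hawk = []
        for _ in range(dim):
            x = (a / 4.0) * math.sin(math.pi * x)
            hawk.append(lower + x * (upper - lower))  # scale into the search range
        population.append(hawk)
    return population


def s_shaped(x):
    """S-shaped transfer function (assumed: sigmoid), written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def v_shaped(x):
    """V-shaped transfer function (assumed: |tanh(x)|)."""
    return abs(math.tanh(x))


def binarize_s(position, rng):
    """S-shaped rule: each bit is set to 1 with probability S(x)."""
    return [1 if rng.random() < s_shaped(x) else 0 for x in position]


def binarize_v(position, previous_bits, rng):
    """V-shaped rule: each bit is flipped with probability V(x), else kept."""
    return [1 - b if rng.random() < v_shaped(x) else b
            for x, b in zip(position, previous_bits)]


def fitness(error_rate, bits, alpha=0.99):
    """Assumed wrapper fitness: trade off classification error against subset size."""
    selected_ratio = sum(bits) / len(bits)
    return alpha * error_rate + (1.0 - alpha) * selected_ratio


if __name__ == "__main__":
    rng = random.Random(1)
    hawks = sine_map_population(n_hawks=4, dim=6, lower=-5.0, upper=5.0)
    bits = binarize_s(hawks[0], rng)
    print(bits, binarize_v(hawks[1], bits, rng))
    # The error rate would come from a classifier trained on the selected features.
    print(fitness(error_rate=0.1, bits=bits))
```

In this sketch the S-shaped rule sets each bit directly from the continuous position, while the V-shaped rule decides whether to flip the previous bit, so a position near zero leaves the current feature subset unchanged.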


  • TP181